Generalize Local Connector Bounded Reads
tasks 10/10
1. Contract
- Add the filesystem/local-DB bounded-read requirement to `local-agent-collector-completeness`.
- Add a manifest- or registry-driven regression guard for local connector whole-file and unbounded `.all()` reads.
- Add reviewed exceptions for small per-artifact reads with explicit reasons.
- Add a reviewed accumulator requirement for logical-unit summaries that must not retain raw source payloads.
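
The guard and reviewed-exception tasks above can be sketched as a small scanner. This is a minimal illustration under stated assumptions, not the guard in the repository: `ReviewedException`, `Violation`, `UNBOUNDED_PATTERNS`, and `findUnboundedReads` are hypothetical names, and plain substring matching stands in for whatever manifest- or registry-driven lookup the real guard uses.

```typescript
// Hypothetical shape of a reviewed exception; these names are not from the repo.
interface ReviewedException {
  file: string;    // connector source path
  pattern: string; // call shape being excused, e.g. "readFile("
  reason: string;  // must be non-empty: why this read is known to be small
}

interface Violation {
  file: string;
  line: number; // 1-based line number within the file
  pattern: string;
}

// Call shapes this sketch treats as potentially unbounded reads.
const UNBOUNDED_PATTERNS = ["readFile(", "readFileSync(", ".all("];

function findUnboundedReads(
  sources: Record<string, string>,
  exceptions: ReviewedException[],
): Violation[] {
  const violations: Violation[] = [];
  for (const file of Object.keys(sources)) {
    const lines = sources[file].split("\n");
    lines.forEach((content, index) => {
      // Promise.all( is not a database read; drop it before matching ".all(".
      const scanned = content.split("Promise.all(").join("");
      for (const pattern of UNBOUNDED_PATTERNS) {
        if (!scanned.includes(pattern)) continue;
        // An exception only counts when it carries an explicit reason.
        const allowed = exceptions.some(
          (e) => e.file === file && e.pattern === pattern && e.reason.trim() !== "",
        );
        if (!allowed) violations.push({ file, line: index + 1, pattern });
      }
    });
  }
  return violations;
}
```

A test built on this idea would fail whenever `findUnboundedReads` returns a non-empty list, so a new whole-file or `.all()` read has to be either converted or added to the exception list with a reason. Substring matching is deliberately crude; a real guard would likely need to ignore comments and string literals.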
2. High-Risk Connectors
- Convert `imessage` local database reads from unbounded `.all()` to row iteration.
- Convert `twitter_archive` archive parsing away from whole-file array materialization. Done: `connectors/twitter_archive/archive-stream.ts` streams `tweets.js` / `tweet.js` / `direct-messages.js` with `createReadStream` plus the vetted, dependency-free `@streamparser/json` parser (`paths: ['$.*']`; `keepStack: false` releases each emitted element). The two `readFile` exceptions were removed from the guard and it still passes. On-disk fixtures with escaped/nested/unicode cases live under `__fixtures__/archive-files/`; streaming-equivalence, chunk-boundary, legacy-fallback, empty/missing/malformed, and end-to-end subprocess tests are in `archive-stream.test.ts`. The prior blocker is resolved; see `research/twitter-archive-streaming-blocker-2026-06-17.md` (Resolution section).
- Convert large Slack dump row reads to row iteration or document bounded query exceptions.
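
To illustrate the row-iteration conversions above, here is a minimal sketch of consuming rows one at a time while keeping only a per-unit summary. It assumes the SQLite driver exposes a lazy row iterator (better-sqlite3's `Statement.iterate()` is one such API); the source does not say which driver the connectors use. `MessageRow`, `ChatSummary`, `summarizeChats`, and `fakeCursor` are hypothetical names, and the generator stands in for a real database cursor so the example runs without a database.

```typescript
// Hypothetical row and summary shapes; the real connector schemas will differ.
interface MessageRow {
  chatId: string;
  text: string;
  sentAt: number;
}

interface ChatSummary {
  messageCount: number;
  lastSentAt: number;
}

// Accepts any lazy row source. Memory grows with the number of chats,
// not the number of rows, because no raw row or text payload is retained.
function summarizeChats(rows: Iterable<MessageRow>): Map<string, ChatSummary> {
  const summaries = new Map<string, ChatSummary>();
  for (const row of rows) {
    const current = summaries.get(row.chatId) ?? { messageCount: 0, lastSentAt: 0 };
    current.messageCount += 1;
    current.lastSentAt = Math.max(current.lastSentAt, row.sentAt);
    summaries.set(row.chatId, current);
  }
  return summaries;
}

// Stand-in for a driver cursor: yields rows lazily instead of building an array.
function* fakeCursor(count: number): Generator<MessageRow> {
  for (let i = 0; i < count; i++) {
    yield { chatId: i % 2 === 0 ? "a" : "b", text: `message ${i}`, sentAt: i };
  }
}

const summaries = summarizeChats(fakeCursor(5));
console.log(summaries.get("a")); // { messageCount: 3, lastSentAt: 4 }
```

The same loop shape applies when the row source is a statement iterator in place of `fakeCursor`: the unbounded `.all()` call disappears, and the accumulator holds only the summary fields.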
3. Validation
- Run targeted polyfill connector tests for changed connectors.
- Run `pnpm --filter @pdpp/polyfill-connectors typecheck`.
- Run `openspec validate generalize-local-connector-bounded-reads --strict`.