Lactoferrin Inter-Lobe Linker Redesign Pilot (Computational, comp-034)

Frozen analysis archived to ./etc/experiments/comp-034-lactoferrin-linker-redesign/wiki-archive.md. This wiki stub remains so cross-references resolve and the page stays discoverable. Computational analyses are write-once artifacts; the daemon does not need to re-read them on every sweep, so the long content lives next to the experiment that produced it at etc/experiments/comp-034-lactoferrin-linker-redesign/.

✓ Tool-stack caveat RESOLVED 2026-05-19 (E2 walkthrough rerun complete). ProteinMPNN was cloned to tools/ProteinMPNN/ (repo-local fallback; the sandbox blocked /opt/ and ~/tools/). The smoke test passed on 5L33 + 6MRR, and lactoferrin inter-lobe linker sampling runs in ~52 s/pool on CPU.

Headline rerun finding: the substitute sampler's 15 GREEN candidates are NOT artifacts. Mean MPNN log-likelihood is 2.74 (GREEN) vs 3.74 (FAIL), a clean separation. The substitute sampler's proline-bias + WT-mix-in heuristic was a coarse but functional proxy for what ProteinMPNN encodes structurally. In addition, genuine MPNN identified 3 STRICT (5-of-5) candidates the substitute sampler never proposed: NEEEQQQEEEQ, NEEEEQQEQEQ, NEEEEEQEQEQ. All three reduce predicted cleavage 10.4× vs WT (0.039 vs 0.407) and pass concordance on all five comp-034 metrics simultaneously. The §1.10 wet-lab arm is validated for gene synthesis, with NEEEQQQEEEQ swapped in as the aggressive arm in place of the substitute sampler's DEEDPANPQAH / EEEEPAAPPAP.

Full report + scoring artifacts: logs/proteinmpnn-comp-034-rerun-2026-05-19.md. Scoring artifacts committed to ./etc/experiments/comp-034-lactoferrin-linker-redesign/proteinmpnn_rerun/.

Install note 2026-05-19: the subagent's tools/ProteinMPNN/ clone was sandbox-ephemeral and did not persist. For a durable /opt/ProteinMPNN install outside the sandbox: git clone https://github.com/dauparas/ProteinMPNN /opt/ProteinMPNN (PyTorch 2.12 + NumPy 2.4 already on system; smoke-test with examples/submit_example_1.sh).

Original 2026-05-16 substitute-sampler caveat (kept for archival genealogy): comp-034's candidate sequences were generated by a transparent substitute sampler because protein_design_mcp shells out to ProteinMPNN scripts at $PROTEINMPNN_PATH (default /opt/ProteinMPNN), and those scripts were not present at the time of the 2026-05-16 run. The substitute sampler is RNG-seeded for reproducibility and biased over the permitted residue pool [E, D, N, Q, H, P], with WT mix-in (15%) and a proline boost at ALP-hot WT positions. The 2026-05-19 rerun (resolved above) confirmed the substitute was a functional proxy and identified additional MPNN-native STRICT candidates.
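The sampler described above can be pictured with a minimal sketch. This is not the archived comp-034 script: the hot-position set, the boost factor, and the function names are hypothetical placeholders chosen for illustration, and the sketch is not structure-conditioned. Only the residue pool, the 15% WT mix-in, and the seeded RNG come from the caveat text.

```python
import random

WT = "SEEEVAARRAR"                      # WT linker, UniProt 353-363
POOL = ["E", "D", "N", "Q", "H", "P"]   # permitted residue pool (from the caveat)
WT_MIX_IN = 0.15                        # probability of keeping the WT residue

# ASSUMPTION: placeholder 0-based indices standing in for "ALP-hot WT positions";
# the real positions live in the archived analysis, not on this page.
HOT_POSITIONS = {4, 7, 8, 10}
# ASSUMPTION: illustrative boost factor, not the value used in comp-034.
PROLINE_BOOST = 3.0


def sample_linker(rng: random.Random, wt: str = WT) -> str:
    """Draw one candidate linker of the same length as the WT."""
    residues = []
    for i, wt_res in enumerate(wt):
        if rng.random() < WT_MIX_IN:
            residues.append(wt_res)     # WT mix-in: keep the native residue
            continue
        # Uniform weights over the pool, with proline up-weighted at hot positions.
        weights = [
            PROLINE_BOOST if (res == "P" and i in HOT_POSITIONS) else 1.0
            for res in POOL
        ]
        residues.append(rng.choices(POOL, weights=weights, k=1)[0])
    return "".join(residues)


def sample_pool(n: int, seed: int) -> list[str]:
    """Seeded pool of n candidates; the same seed reproduces the same pool."""
    rng = random.Random(seed)
    return [sample_linker(rng) for _ in range(n)]


if __name__ == "__main__":
    for seq in sample_pool(5, seed=34):
        print(seq)
```

Passing a dedicated `random.Random` instance rather than using the module-level RNG keeps the pool reproducible even if other code draws random numbers in the same process.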

Can the human lactoferrin inter-lobe linker (UniProt 353-363, mature 334-344, sequence SEEEVAARRAR) be redesigned to reduce predicted shio-koji protease cleavage while preserving lobe-lobe geometry and A. oryzae codon compatibility?

Headline verdict (substitute-sampler pool): 15 of 60 candidates pass the N-of-5 ≥ 3 concordance gate (GREEN tier). Zero pass STRICT (5-of-5). The WT linker passes 3-of-5, confirming the redesign premise (WT is the most protease-rich linker in the candidate pool). Top primary wet-lab variant EEEEPAARRAR (S353E + V357P; mature S334E + V338P; 2 substitutions, 82% WT identity) passes 4-of-5 with a cleavage drop of ~29%. The true single-V357P variant SEEEPAARRAR (91% WT identity) passes 3-of-5 (fails the loop_pLDDT band by 1.6) and serves as the secondary wet-lab anchor. The aggressive 4-of-5 variant EEEEPAAPPAP (multi-proline, 55% WT identity) is the second-line option.
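The substitution labels, identity percentages, and tier names in the verdict can be rechecked with a few lines of arithmetic. The sketch below assumes only what the page states: the WT sequence, the UniProt and mature numbering starts (353 and 334), and the N-of-5 gate. The function names are illustrative, and treating STRICT as the strictest label for a 5-of-5 candidate is a reading of the text, not a documented rule.

```python
WT = "SEEEVAARRAR"
UNIPROT_START = 353   # UniProt numbering of the first linker residue
MATURE_START = 334    # mature-protein numbering of the same residue


def percent_identity(variant: str, wt: str = WT) -> float:
    """Percentage of positions where the variant matches the WT."""
    if len(variant) != len(wt):
        raise ValueError("variant and WT must have the same length")
    matches = sum(v == w for v, w in zip(variant, wt))
    return 100.0 * matches / len(wt)


def mutations(variant: str, wt: str = WT, start: int = UNIPROT_START) -> list[str]:
    """Substitution labels such as 'V357P' in the chosen numbering scheme."""
    return [
        f"{w}{start + i}{v}"
        for i, (w, v) in enumerate(zip(wt, variant))
        if w != v
    ]


def tier(n_passed: int, n_metrics: int = 5, green_min: int = 3) -> str:
    """Map an N-of-5 concordance count to the strictest tier it satisfies."""
    if n_passed == n_metrics:
        return "STRICT"
    if n_passed >= green_min:
        return "GREEN"
    return "FAIL"


if __name__ == "__main__":
    print(mutations("EEEEPAARRAR"))                      # ['S353E', 'V357P']
    print(mutations("EEEEPAARRAR", start=MATURE_START))  # ['S334E', 'V338P']
    print(round(percent_identity("EEEEPAARRAR")))        # 82
    print(round(percent_identity("SEEEPAARRAR")))        # 91
    print(round(percent_identity("EEEEPAAPPAP")))        # 55
    print(tier(4))                                       # GREEN
```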

This is the first concrete use of the protein-design-mcp tool stack (etc/bio-ai-tools.md §BioDesignBench). The MCP wrapper loaded correctly on this host but the external ProteinMPNN repository at /opt/ProteinMPNN was not present, so a structure-conditioned biased sampler was substituted with transparent flagging. The substitution is documented in detail in the archive page; regenerating the candidate pool with genuine ProteinMPNN when the repo is installed is a single-command rerun.
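A pre-flight check along the following lines would surface the missing repository before a run starts. This is a sketch under assumptions, not the wrapper's actual logic: how protein-design-mcp locates ProteinMPNN internally is not documented on this page, and `resolve_proteinmpnn` and `choose_sampler` are hypothetical helper names. The environment variable, its default, and the `protein_mpnn_run.py` entry script at the repository root are the only external names relied on.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_PROTEINMPNN_PATH = "/opt/ProteinMPNN"  # default named on this page


def resolve_proteinmpnn(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the ProteinMPNN root if its entry script is present, else None."""
    env = os.environ if env is None else env
    root = Path(env.get("PROTEINMPNN_PATH", DEFAULT_PROTEINMPNN_PATH))
    # protein_mpnn_run.py is the entry script at the root of the upstream repo.
    return root if (root / "protein_mpnn_run.py").is_file() else None


def choose_sampler(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the sampler and make any substitution explicit rather than silent."""
    if resolve_proteinmpnn(env) is not None:
        return "proteinmpnn"
    return "substitute"


if __name__ == "__main__":
    sampler = choose_sampler()
    if sampler == "substitute":
        print("WARNING: ProteinMPNN not found; using the flagged substitute sampler")
    else:
        print("ProteinMPNN found; using the genuine sampler")
```

Recording the returned label alongside the candidate pool is one way to keep the substitution visible in downstream artifacts.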

Where the analysis lives:

- Full archived analysis: ./etc/experiments/comp-034-lactoferrin-linker-redesign/wiki-archive.md
- Experiment directory (inputs, scripts, outputs): ./etc/experiments/comp-034-lactoferrin-linker-redesign/
- Computational experiments index: computational-experiments.md

Evidence level: Mechanistic Extrapolation (in silico only). Wet-lab validation required — comp-034 expands the validation-experiments.md §1.10 lactoferrin arm from a single-variant feasibility test into a multi-variant ranked design study (recommended plate: WT control + V357P conservative + DEEDPANPQAH aggressive).


Open follow-up — does the proline-rigidification strategy generalize? (added 2026-05-19, Cluster E walkthrough)

The valid generalization question: does the proline-rigidification design logic applied to lactoferrin's inter-lobe linker generalize to other secreted OE payloads with structured-mandatory-connector-type linker vulnerabilities?

Definition of the right candidate class (the generalization domain):

- (a) The linker is short and structured (high pLDDT, ordered secondary structure).
- (b) It cannot be removed without breaking the protein's function (it connects two essential domains).
- (c) It shows protease vulnerability in koji proteomics (high predicted cleavage-site density).
- (d) The host's proteolytic environment (shio-koji or equivalent) is the production format.
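The four criteria amount to a conjunction, which a small screening helper can make explicit. The sketch below is illustrative only: the `LinkerProfile` type and its field names are hypothetical, and the boolean flags are hand-assigned from this page's prose (the lactoferrin linker as structured and mandatory, the DAF stalk discussed further down as disordered and removable) rather than computed from pLDDT or cleavage predictions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkerProfile:
    name: str
    structured: bool           # (a) short, ordered, high-pLDDT linker
    mandatory: bool            # (b) cannot be removed without losing function
    protease_vulnerable: bool  # (c) high predicted cleavage-site density
    proteolytic_host: bool     # (d) produced in shio-koji or an equivalent format


def unmet_criteria(profile: LinkerProfile) -> list[str]:
    """Letters of the criteria the profile fails, in (a)-(d) order."""
    checks = {
        "a": profile.structured,
        "b": profile.mandatory,
        "c": profile.protease_vulnerable,
        "d": profile.proteolytic_host,
    }
    return [letter for letter, ok in checks.items() if not ok]


def in_generalization_domain(profile: LinkerProfile) -> bool:
    """True only when all four criteria hold."""
    return not unmet_criteria(profile)


# Flags assigned by hand from the descriptions on this page (assumption).
LF_LINKER = LinkerProfile("lactoferrin inter-lobe linker", True, True, True, True)
DAF_STALK = LinkerProfile("DAF stalk (aa 286-353)", False, False, True, True)

if __name__ == "__main__":
    for profile in (LF_LINKER, DAF_STALK):
        print(profile.name, in_generalization_domain(profile), unmet_criteria(profile))
```

Reporting which criteria fail, and not only the final verdict, is what separates a truncation case from a rigidification case.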

Examples of candidate cases worth watching as the platform's payload pipeline grows:

- Multi-domain fusion proteins with short structured connectors
- Therapeutic peptides ≥3 kDa with structured architecture
- Future siRNA-protein conjugates if the linker is structured

⚠ DAF SCR1-4 is NOT the right exemplar (Pass 3 2026-05-17 correction, ratified 2026-05-19 Cluster E walkthrough). The original 2026-05-16 sweep proposed DAF SCR1-4 inter-SCR linkers as the generalization test. Pass 3 correctly pushed back: comp-012 says stalk truncation (aa 286–353 removal) eliminates 100% of exposed sites; the SCR1-4 core is LOW protease risk after truncation; the short inter-SCR linkers are NOT identified as remaining protease-liability targets. DAF is solved by truncation, not by linker rigidification — fundamentally the opposite design strategy. The daemon's Pass 2 conflation of "exposed protease-accessible region" between Lf's structured-mandatory linker (aa 353–363) and DAF's disordered-removable stalk (aa 286–353) is a documented Pass 2 failure mode — surface-level pattern-matching without structural-detail check. See etc/bio-ai-tools.md §"Protease-vulnerability-to-redesign workflow" step 2 ("vulnerability classification — structural-mandatory vs structural-removable") for the discipline that catches this class of error.

Status: open question, dormant until a new secreted payload candidate emerges with a structured-mandatory-connector vulnerability profile. At that point the comp-005 → comp-034-style workflow re-fires on that target. Cluster J3's substrate engineering platform principle may surface relevant candidates (substrate-engineering reagents that boost cordycepin or ergothioneine could indirectly require structural redesign for new fungal payloads).