Lactoferrin Inter-Lobe Linker Redesign Pilot (Computational, comp-034)
Frozen analysis archived to ./etc/experiments/comp-034-lactoferrin-linker-redesign/wiki-archive.md. This wiki stub remains so cross-references resolve and the page stays discoverable. Computational analyses are write-once artifacts; the daemon does not need to re-read them on every sweep, so the long content lives next to the experiment that produced it at etc/experiments/comp-034-lactoferrin-linker-redesign/.

✓ Tool-stack caveat RESOLVED 2026-05-19 (E2 walkthrough rerun complete). ProteinMPNN was cloned to tools/ProteinMPNN/ (repo-local fallback; the sandbox blocked /opt/ and ~/tools/). The smoke test passed on 5L33 + 6MRR; lactoferrin inter-lobe linker sampling runs in ~52 s/pool on CPU.

Headline rerun finding: the substitute sampler's 15 GREEN candidates are NOT artifacts. Mean MPNN log-likelihood 2.74 (GREEN) vs 3.74 (FAIL) gives clean separation. The substitute sampler's proline-bias + WT-mix-in heuristic was a coarse but functional proxy for what ProteinMPNN encodes structurally. In addition, genuine MPNN identified 3 STRICT (5-of-5) candidates the substitute sampler never proposed: NEEEQQQEEEQ, NEEEEQQEQEQ, NEEEEEQEQEQ. All three reduce predicted cleavage 10.4× vs WT (0.039 vs 0.407) and pass concordance on all five comp-034 metrics simultaneously. The §1.10 wet-lab arm is validated for gene synthesis, with NEEEQQQEEEQ swapped in as the aggressive arm in place of the substitute sampler's DEEDPANPQAH / EEEEPAAPPAP. Full report + scoring artifacts: logs/proteinmpnn-comp-034-rerun-2026-05-19.md. Scoring artifacts are committed to ./etc/experiments/comp-034-lactoferrin-linker-redesign/proteinmpnn_rerun/.

Install note 2026-05-19: the subagent's tools/ProteinMPNN/ clone was sandbox-ephemeral and did not persist. For a durable /opt/ProteinMPNN install outside the sandbox: git clone https://github.com/dauparas/ProteinMPNN /opt/ProteinMPNN (PyTorch 2.12 + NumPy 2.4 already on system; smoke-test with examples/submit_example_1.sh).

Original 2026-05-16 substitute-sampler caveat (kept for archival genealogy): comp-034's candidate sequences were generated by a transparent substitute sampler because protein_design_mcp shells out to ProteinMPNN scripts at $PROTEINMPNN_PATH (default /opt/ProteinMPNN), and those were not present at the 2026-05-16 run. The substitute sampler is RNG-seeded for reproducibility, biased over the permitted residue pool [E, D, N, Q, H, P] with WT mix-in (15%) and proline-boost at ALP-hot WT positions. The 2026-05-19 rerun (resolved above) confirmed the substitute was a functional proxy and identified additional MPNN-native STRICT candidates.
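For readers who want a concrete picture of the substitute sampler described above, here is a minimal sketch. It is not the archived script. The residue pool, the 15% WT mix-in, and the WT sequence come from this page; the set of "ALP-hot" positions (`HOT_POSITIONS`) and the size of the proline boost (`PROLINE_BOOST`) are hypothetical placeholders, since this stub does not state them.

```python
import random

WT_LINKER = "SEEEVAARRAR"               # UniProt 353-363 (from this page)
POOL = ["E", "D", "N", "Q", "H", "P"]   # permitted residue pool (from this page)
WT_MIX_IN = 0.15                        # WT mix-in probability (from this page)

# ASSUMPTIONS, not taken from the archived analysis:
HOT_POSITIONS = {4, 7, 8, 10}           # hypothetical 0-based "ALP-hot" indices
PROLINE_BOOST = 3.0                     # hypothetical weight multiplier for P


def sample_linker(rng: random.Random) -> str:
    """Draw one candidate linker, position by position."""
    residues = []
    for i, wt_res in enumerate(WT_LINKER):
        # With probability WT_MIX_IN, keep the wild-type residue.
        if rng.random() < WT_MIX_IN:
            residues.append(wt_res)
            continue
        # Otherwise draw from the pool, up-weighting proline at hot positions.
        weights = [
            PROLINE_BOOST if (aa == "P" and i in HOT_POSITIONS) else 1.0
            for aa in POOL
        ]
        residues.append(rng.choices(POOL, weights=weights, k=1)[0])
    return "".join(residues)


def sample_pool(n: int, seed: int) -> list[str]:
    """Seeded pool of n candidates; the same seed reproduces the same pool."""
    rng = random.Random(seed)
    return [sample_linker(rng) for _ in range(n)]
```

Because the generator is a seeded `random.Random` instance rather than the module-level RNG, two calls with the same seed return identical pools, which is the reproducibility property the caveat relies on.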
Can the human lactoferrin inter-lobe linker (UniProt 353-363, mature 334-344, sequence
SEEEVAARRAR) be redesigned to reduce predicted shio-koji protease cleavage while
preserving lobe-lobe geometry and A. oryzae codon compatibility?
Headline verdict (2026-05-16 substitute-sampler pool): 15 of 60 candidates pass the N-of-5 ≥ 3 concordance gate (GREEN tier).
Zero pass STRICT (5-of-5). The WT linker passes 3-of-5, confirming the redesign premise
(WT is the most protease-rich linker in the candidate pool). Top primary wet-lab variant
EEEEPAARRAR (S353E + V357P; mature S334E + V338P; 2 substitutions, 82% WT identity) passes
4-of-5 with a cleavage drop of ~29%. The true single-V357P variant SEEEPAARRAR (91% WT identity)
passes 3-of-5 (it fails the loop_pLDDT band by 1.6) and is the secondary wet-lab anchor. The aggressive
4-of-5 variant EEEEPAAPPAP (multi-proline, 55% WT identity) is the second-line option.
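The substitution labels, WT-identity percentages, and tier names quoted above can be re-derived with a few lines of Python. This is an illustrative sketch, not the archived scoring code: the function names are hypothetical, and the tier rule simply restates the gate as worded on this page (5-of-5 is STRICT, 3-of-5 or better is GREEN, anything lower is FAIL).

```python
WT_LINKER = "SEEEVAARRAR"
UNIPROT_START = 353   # UniProt numbering of the first linker residue
MATURE_START = 334    # mature-protein numbering of the same residue


def substitutions(variant: str, start: int) -> list[str]:
    """Label each WT -> variant change, e.g. 'V357P', in the given numbering."""
    if len(variant) != len(WT_LINKER):
        raise ValueError("variant must have the same length as the WT linker")
    return [
        f"{wt}{start + i}{mut}"
        for i, (wt, mut) in enumerate(zip(WT_LINKER, variant))
        if wt != mut
    ]


def wt_identity_pct(variant: str) -> float:
    """Percentage of positions identical to the WT linker."""
    if len(variant) != len(WT_LINKER):
        raise ValueError("variant must have the same length as the WT linker")
    same = sum(a == b for a, b in zip(WT_LINKER, variant))
    return 100.0 * same / len(WT_LINKER)


def tier(n_pass: int) -> str:
    """Map the number of passed metrics (out of 5) to a tier label."""
    if not 0 <= n_pass <= 5:
        raise ValueError("n_pass must be between 0 and 5")
    if n_pass == 5:
        return "STRICT"
    return "GREEN" if n_pass >= 3 else "FAIL"
```

For example, `substitutions("EEEEPAARRAR", UNIPROT_START)` returns `["S353E", "V357P"]`, and `round(wt_identity_pct("EEEEPAARRAR"))` gives 82 (9 of 11 positions unchanged).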
This is the first concrete use of the protein-design-mcp tool stack (etc/bio-ai-tools.md §BioDesignBench). The MCP wrapper loaded correctly on this
host but the external ProteinMPNN repository at /opt/ProteinMPNN was not present, so a
structure-conditioned biased sampler was substituted with transparent flagging. The
substitution is documented in detail in the archive page; regenerating the candidate pool
with genuine ProteinMPNN when the repo is installed is a single-command rerun.
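A minimal sketch of the fallback decision described above, assuming the wrapper reads `$PROTEINMPNN_PATH` with `/opt/ProteinMPNN` as the default (both stated on this page). Checking for `protein_mpnn_run.py` at the repository root as the "is it installed" test is an assumption about how one might detect the clone, and `resolve_proteinmpnn` and `choose_sampler` are hypothetical names, not functions of protein_design_mcp.

```python
import os
from pathlib import Path

DEFAULT_PROTEINMPNN_PATH = "/opt/ProteinMPNN"   # default stated on this page
ENTRY_SCRIPT = "protein_mpnn_run.py"            # top-level script in the upstream repo


def resolve_proteinmpnn(env=None) -> tuple[Path, bool]:
    """Return the configured ProteinMPNN root and whether it looks installed."""
    env = os.environ if env is None else env
    root = Path(env.get("PROTEINMPNN_PATH", DEFAULT_PROTEINMPNN_PATH))
    # ASSUMPTION: presence of the entry script is treated as "installed".
    return root, (root / ENTRY_SCRIPT).is_file()


def choose_sampler(env=None) -> str:
    """Pick the genuine sampler when available, else the flagged substitute."""
    _, installed = resolve_proteinmpnn(env)
    return "proteinmpnn" if installed else "substitute"
```

Passing `env` explicitly keeps the check testable without touching the real environment; calling `choose_sampler()` with no argument reads `os.environ`.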
Where the analysis lives:
- Full archived analysis: ./etc/experiments/comp-034-lactoferrin-linker-redesign/wiki-archive.md
- Experiment directory (inputs, scripts, outputs): ./etc/experiments/comp-034-lactoferrin-linker-redesign/
- Computational experiments index: computational-experiments.md
Evidence level: Mechanistic Extrapolation (in silico only). Wet-lab validation
required — comp-034 expands the validation-experiments.md §1.10 lactoferrin arm from a single-variant feasibility test into a multi-variant
ranked design study (recommended plate: WT control + V357P conservative + DEEDPANPQAH
aggressive).
Open follow-up: does the proline-rigidification strategy generalize? (added 2026-05-19, Cluster E walkthrough)
The valid generalization question: does the proline-rigidification design logic applied to lactoferrin's inter-lobe linker generalize to other secreted OE payloads with structured-mandatory-connector-type linker vulnerabilities?
Definition of the right candidate class (the generalization domain):
- (a) The linker is short and structured (high pLDDT, ordered secondary structure).
- (b) It cannot be removed without breaking the protein's function (it connects two essential domains).
- (c) It shows protease vulnerability in koji proteomics (high predicted cleavage-site density).
- (d) The host's proteolytic environment (shio-koji or equivalent) is the production format.
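The four criteria can be read as a conjunction, which the sketch below encodes as a single predicate. Everything numeric here is an assumption: this page does not define cut-offs for "short", "high pLDDT", or "high predicted cleavage-site density", so `MAX_LENGTH`, `MIN_PLDDT`, and `MIN_CLEAVAGE` are hypothetical placeholders, as are the type and field names.

```python
from dataclasses import dataclass

# ASSUMPTIONS: illustrative thresholds only; the page gives no numeric cut-offs.
MAX_LENGTH = 15        # "short" linker, in residues
MIN_PLDDT = 70.0       # "high pLDDT"
MIN_CLEAVAGE = 0.3     # "high" predicted cleavage score


@dataclass(frozen=True)
class LinkerProfile:
    length: int                 # linker length in residues
    mean_plddt: float           # mean pLDDT over the linker
    removable: bool             # True if the region can be deleted without loss of function
    predicted_cleavage: float   # predicted protease cleavage score
    proteolytic_host: bool      # True if produced in shio-koji or an equivalent format


def in_generalization_domain(p: LinkerProfile) -> bool:
    """True only when criteria (a) through (d) all hold."""
    short_and_structured = p.length <= MAX_LENGTH and p.mean_plddt >= MIN_PLDDT  # (a)
    mandatory = not p.removable                                                  # (b)
    vulnerable = p.predicted_cleavage >= MIN_CLEAVAGE                            # (c)
    return short_and_structured and mandatory and vulnerable and p.proteolytic_host  # (d)
```

Under these placeholder thresholds, a structured, non-removable 11-residue linker with a predicted cleavage of 0.407 qualifies, while any region flagged `removable=True` does not, regardless of its other properties.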
Examples of candidate cases worth watching as the platform's payload pipeline grows:
- Multi-domain fusion proteins with short structured connectors
- Therapeutic peptides ≥3 kDa with structured architecture
- Future siRNA-protein conjugates if the linker is structured
⚠ DAF SCR1-4 is NOT the right exemplar (Pass 3 2026-05-17 correction, ratified 2026-05-19 Cluster E walkthrough). The original 2026-05-16 sweep proposed DAF SCR1-4 inter-SCR linkers as the generalization test. Pass 3 correctly pushed back: comp-012 says stalk truncation (aa 286–353 removal) eliminates 100% of exposed sites; the SCR1-4 core is LOW protease risk after truncation; the short inter-SCR linkers are NOT identified as remaining protease-liability targets. DAF is solved by truncation, not by linker rigidification — fundamentally the opposite design strategy. The daemon's Pass 2 conflation of "exposed protease-accessible region" between Lf's structured-mandatory linker (aa 353–363) and DAF's disordered-removable stalk (aa 286–353) is a documented Pass 2 failure mode — surface-level pattern-matching without structural-detail check. See etc/bio-ai-tools.md §"Protease-vulnerability-to-redesign workflow" step 2 ("vulnerability classification — structural-mandatory vs structural-removable") for the discipline that catches this class of error.
Status: open question dormant until a new secreted payload candidate emerges with a structured-mandatory-connector vulnerability profile. Then the comp-005 → comp-034-style workflow re-fires on that target. Cluster J3's substrate engineering platform principle may surface relevant candidates (substrate-engineering reagents that boost cordycepin or ergothioneine could indirectly require structural redesign for new fungal payloads).