comp-010: Cassette Compatibility - Dual-Cassette Koji Endgame Strain
Question: Does the uricase (Q00511) + lactoferrin (P02788) payload pair have any cassette-design-specific issues — codon collisions, KEX2 site geometry problems, or secretion-pathway burden — that the Ward 1995 glucoamylase-KEX2 architecture will not handle out of the box?
Verdict: LOW overall cassette-design risk (for the proposed direct-secretion/fusion asymmetric architecture)
No blocking cassette-design issues were identified. Uricase (direct-secretion cassette, fungal origin, 0 disulfides) has no KEX2 fusion concerns and a negligible codon-optimization burden. Lactoferrin has two internal KR sites: residue 38 (P1'=D, cleavage abolished) is non-functional, and residue 579 (P1'=K, reduced efficiency) is moderate-risk, so neither is a high-risk truncation site. The disulfide folding load (17 disulfides total, all from Lf) is 1.06× the Huynh 2020 adalimumab precedent, which is within demonstrated A. oryzae capacity. One informational finding: uricase has 1 internal KR site (residue 128, P1'=N), which would be high-risk in a fusion cassette but is irrelevant for the proposed direct-secretion architecture.
Informs: validation-experiments.md §1.9 — Ward 1995 dual-cassette feasibility test; comp-010 supports the §1.9 design and removes cassette-architecture as a pre-experiment concern
Interpretive wiki page: wiki/cassette-compatibility-computational.md
Related experiments: comp-001 (uricase protease stability, LOW) | comp-005 (lactoferrin protease stability, MODERATE)
How to reproduce
Run analyze.py (listed in the file index below). No external packages are required (stdlib only: json, math, pathlib). Outputs land in outputs/.
File index
comp-010-cassette-compatibility/
    analyze.py ← analysis script (run this)
    inputs/
        Q00511.fasta ← A. flavus uricase (UniProt, 302 aa; no signal peptide)
        P02788.fasta ← human lactoferrin (UniProt, 710 aa including signal peptide aa 1-19)
        glucoamylase_carrier.fasta ← A. awamori glucoamylase P69327 (Ward 1995 / Huynh 2020 carrier)
        a_oryzae_codon_usage.json ← A. oryzae RIB40 codon usage (RSCU + freq/1000; Kazusa database)
        kex2_site_specs.json ← KEX2 cleavage rules (KR↓X, P1' preferences, KRGGG linker)
        provenance.md ← sources, fetch dates, citations for every input
    outputs/ ← generated by analyze.py; committed as artifacts
        cassette_analysis.json ← machine-readable full analysis
        summary.md ← human-readable; cited in the wiki
    README.md ← this file
Seven analyses
| # | Analysis | Uricase verdict | Lactoferrin verdict |
|---|---|---|---|
| 1 | Codon usage (CAI proxy + hotspot scan) | LOW burden | LOW (proxy); full codon-opt required in practice |
| 2 | KEX2 site geometry | HIGH (1 site, residue 128) — not load-bearing (direct-secretion cassette) | MODERATE (1 site, residue 579, P1'=K; 1 abolished, residue 38, P1'=D) |
| 3 | Secretion-targeting signals | MODERATE (C-terminal SKL resembles PTS1 — verify in vivo) | LOW (no routing issues) |
| 4 | Disulfide load | VERY LOW (0 disulfides, 0× Huynh baseline) | MODERATE (17 disulfides, 1.06× Huynh 2020 baseline) |
| 5 | N-glycosylation sites | 1 predicted (NFS at pos 191; not UniProt-annotated — unlikely occupied) | 3 predicted, all UniProt-annotated (N137, N478, N623) |
| 6 | Combined secretion burden | LOW (no concurrent blocking factors) | — |
| 7 | Huynh 2020 comparison | EASIER (fungal origin) | HARDER (solid-state format); COMPARABLE (disulfide load, host strain) |
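The KEX2 geometry row can be illustrated with a small motif scan. This is a minimal sketch, not the logic in analyze.py: the function name `find_kr_sites`, the grading table, and the choice to report the 1-based position of the P1 arginine are all assumptions made for illustration. Only the P1'=D (abolished) and P1'=K (moderate) grades come from this README; treating every other P1' residue as high-risk is a simplification, and the real rules live in kex2_site_specs.json.

```python
# Sketch: scan a protein sequence for internal Lys-Arg (KR) dipeptides and
# grade each one by the residue immediately after the cleavage point (P1').

P1_PRIME_RISK = {
    "D": "abolished",  # from this README: P1'=D, cleavage abolished
    "K": "moderate",   # from this README: P1'=K, reduced efficiency
}
DEFAULT_RISK = "high"  # assumption: any other P1' residue is treated as cleavable


def find_kr_sites(seq: str) -> list:
    """Return one record per KR dipeptide, using 1-based residue numbering."""
    seq = seq.upper()
    sites = []
    start = seq.find("KR")
    while start != -1:
        p1_index = start + 1  # 0-based index of the arginine (P1)
        has_next = p1_index + 1 < len(seq)
        p1_prime = seq[p1_index + 1] if has_next else None
        if p1_prime is None:
            risk = "c-terminal"  # KR at the very end: nothing downstream to release
        else:
            risk = P1_PRIME_RISK.get(p1_prime, DEFAULT_RISK)
        sites.append({
            "p1_position": p1_index + 1,  # 1-based position of the P1 arginine
            "p1_prime": p1_prime,
            "risk": risk,
        })
        start = seq.find("KR", start + 1)
    return sites


if __name__ == "__main__":
    toy_peptide = "MAKRDLLGSKRKTTAKRNQ"  # made-up sequence, not a real payload
    for site in find_kr_sites(toy_peptide):
        print(site)
```

Running the same scan on the sequences in inputs/ would need a numbering convention that matches the one used in the table (precursor versus mature numbering), which this sketch does not settle.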
Key results
| Scope | Verdict | Primary driver |
|---|---|---|
| Overall cassette-design risk | LOW | No blocking KEX2 or fold issues in proposed architecture |
| Uricase cassette | LOW | Fungal origin, 0 disulfides, direct-secretion design |
| Lactoferrin cassette | MODERATE on KEX2 | 1 moderate-risk internal KR site (watch by SDS-PAGE); 17 disulfides within Huynh precedent |
| Titer gap vs. Huynh 2020 | 12.6× for Lf target | Huynh 39.7 mg/L adalimumab, so 12.6× corresponds to a target of about 500 mg/L; Ward 1995 reported >2 g/L Lf (about 50× Huynh) and is the correct benchmark |
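The ratios quoted in the tables can be checked with a few lines of arithmetic. The sketch below uses only figures stated in this README; the variable names are illustrative, and the implied Lf target and implied Huynh disulfide baseline are back-calculated from the quoted ratios rather than taken from the source.

```python
# Figures quoted in this README
HUYNH_ADALIMUMAB_MG_L = 39.7  # Huynh 2020 adalimumab titer
WARD_LF_MG_L = 2000.0         # Ward 1995 lower bound (">2 g/L")
TITER_GAP = 12.6              # "12.6x for Lf target"
LF_DISULFIDES = 17            # all from lactoferrin
DISULFIDE_RATIO = 1.06        # "1.06x Huynh 2020 baseline"

# Back-calculated values (inferences, not numbers stated in the README)
implied_target_mg_l = TITER_GAP * HUYNH_ADALIMUMAB_MG_L
ward_fold = WARD_LF_MG_L / HUYNH_ADALIMUMAB_MG_L
implied_baseline_disulfides = LF_DISULFIDES / DISULFIDE_RATIO

print(f"Lf target implied by 12.6x: {implied_target_mg_l:.0f} mg/L")
print(f"Ward 1995 lower bound vs. Huynh: {ward_fold:.1f}x")
print(f"Huynh disulfide baseline implied by 1.06x: {implied_baseline_disulfides:.1f}")
```

The 12.6× gap therefore lines up with a target near 500 mg/L, while the Ward 1995 lower bound of 2 g/L sits roughly 50× above the Huynh titer.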
Disagreement protocol
If you reproduce the outputs and disagree with the methods or numbers, file a GitHub issue referencing this folder (comp-010-cassette-compatibility). Primary candidates for revision: KEX2 P1' classification (based on S. cerevisiae Kex2p — A. oryzae kexB specificity may differ); codon usage table (Kazusa genome-wide average, ±15% RSCU error); signal peptide boundary for uricase (no canonical SP annotated — verify by SignalP or equivalent); peroxisomal PTS1 risk assessment for C-terminal SKL in uricase.
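For anyone probing the codon-usage caveat, the sketch below shows one way a CAI-style proxy could be recomputed after shifting a single RSCU value by ±15%. It is an assumption-laden illustration: `cai_proxy` and `scale_codon` are hypothetical helpers, the scoring (geometric mean of RSCU divided by the best synonymous RSCU) is the textbook CAI form and may differ from what analyze.py does, and `TOY_RSCU` holds placeholder numbers, not the Kazusa A. oryzae table.

```python
import math

# Placeholder RSCU values for two amino acids only; NOT real A. oryzae data.
TOY_RSCU = {
    "K": {"AAG": 1.4, "AAA": 0.6},
    "D": {"GAC": 1.1, "GAT": 0.9},
}
CODON_TO_AA = {codon: aa for aa, codons in TOY_RSCU.items() for codon in codons}


def cai_proxy(cds: str, rscu: dict) -> float:
    """Geometric mean of w = RSCU / max synonymous RSCU over scorable codons."""
    log_sum, n = 0.0, 0
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3].upper()
        aa = CODON_TO_AA.get(codon)
        if aa is None:  # codon not covered by the toy table: skip it
            continue
        w = rscu[aa][codon] / max(rscu[aa].values())
        log_sum += math.log(w)
        n += 1
    return math.exp(log_sum / n) if n else float("nan")


def scale_codon(rscu: dict, aa: str, codon: str, factor: float) -> dict:
    """Return a copy of the table with one codon's RSCU multiplied by factor."""
    out = {a: dict(codons) for a, codons in rscu.items()}
    out[aa][codon] *= factor
    return out


if __name__ == "__main__":
    toy_cds = "AAAGATAAGGAC"  # made-up coding sequence: K D K D
    for factor in (0.85, 1.0, 1.15):
        table = scale_codon(TOY_RSCU, "K", "AAA", factor)
        print(factor, round(cai_proxy(toy_cds, table), 3))
```

Scaling one codon in isolation leaves the table un-normalized, so this is only a crude sensitivity check; a fuller treatment would perturb every codon and renormalize within each amino acid.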