comp-011: Cassette Compatibility — C. utilis Uricase + Lactoferrin in A. oryzae

Question: Does the C. utilis uricase (P78609) + lactoferrin (P02788) payload pair have any cassette-design-specific issues — codon collisions, KEX2 site geometry problems, or secretion-pathway burden — that the Ward 1995 glucoamylase-KEX2 architecture will not handle out of the box in A. oryzae?

Verdict: MODERATE overall cassette-design risk — not a fundamental incompatibility; three manageable design requirements vs. comp-010 (A. flavus, LOW)

Three material differences from comp-010 (A. flavus):

1. Codon burden HEAVY (C. utilis AT-biased, GC ~42% vs. A. oryzae ~54%): full gene synthesis optimization required (same as lactoferrin).
2. 4 free cysteines (vs. 0 in A. flavus): ER aggregation risk during secretion; monitor by non-reducing SDS-PAGE.
3. 2 internal KR sites (positions 130 and 138, both HIGH) vs. 1 in A. flavus: non-load-bearing in direct-secretion design.
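The first two differences rest on simple sequence statistics. The sketch below shows one way to compute a GC fraction and a free-cysteine count with the standard library only. The function names and the toy sequence are illustrative assumptions, not taken from analyze.py, and the free-Cys count assumes every annotated disulfide consumes exactly two cysteines.

```python
def gc_fraction(dna: str) -> float:
    """Fraction of G/C among the A/C/G/T bases of a DNA string."""
    bases = [b for b in dna.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("no A/C/G/T bases found")
    return sum(b in "GC" for b in bases) / len(bases)


def count_cysteines(protein: str) -> int:
    """Total number of Cys residues in a one-letter protein sequence."""
    return protein.upper().count("C")


def free_cysteines(protein: str, n_disulfides: int) -> int:
    """Cys residues left unpaired, assuming two Cys per annotated disulfide."""
    free = count_cysteines(protein) - 2 * n_disulfides
    if free < 0:
        raise ValueError("more disulfides than available cysteine pairs")
    return free


# Hypothetical 15-nt fragment, for illustration only (not a real gene).
toy_cds = "ATGGCTAAATTTGGC"
print(f"GC fraction: {gc_fraction(toy_cds):.2f}")
```

With 0 annotated disulfides, as reported for the C. utilis uricase, every cysteine in the sequence counts as free.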

Follow-on to: comp-010 (A. flavus uricase Q00511 + lactoferrin, LOW risk)

Informs: validation-experiments.md §1.9 — Ward 1995 dual-cassette feasibility test

Interpretive wiki page: wiki/c-utilis-uricase-cassette-compatibility-computational.md

Accession note: P15296 (originally cited as likely C. utilis uricase) is a reassigned accession returning a Drosophila protein. Correct canonical entry: P78609 (URIC_CYBJA, Cyberlindnera jadinii) — verified by taxon search.


How to reproduce

cd experiments/comp-011-c-utilis-uricase-cassette-compatibility
python3 analyze.py

No external packages required (stdlib only: json, math, pathlib). Outputs land in outputs/.


File index

comp-011-c-utilis-uricase-cassette-compatibility/
  analyze.py                       ← analysis script (run this)
  inputs/
    P78609.fasta                   ← C. utilis uricase P78609 (UniProt, 303 aa; no signal peptide)
    P02788.fasta                   ← human lactoferrin (UniProt, 710 aa; signal peptide aa 1-19)
    glucoamylase_carrier.fasta     ← A. awamori glucoamylase P69327 (Ward 1995 / Huynh 2020 carrier)
    a_oryzae_codon_usage.json      ← A. oryzae RIB40 codon usage (RSCU + freq/1000; Kazusa database)
    kex2_site_specs.json           ← KEX2 cleavage rules (KR|X, P1' preferences, KRGGG linker)
    provenance.md                  ← sources, fetch dates, citations, accession note
  outputs/                         ← generated by analyze.py; committed as artifacts
    cassette_analysis.json         ← machine-readable full analysis
    summary.md                     ← human-readable; cited in the wiki
  README.md                        ← this file
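The inputs are plain FASTA files, so they can be inspected without external packages. The following is a minimal sketch of a stdlib FASTA reader. The names `parse_fasta` and `read_fasta` are hypothetical and are not claimed to match anything in analyze.py.

```python
from pathlib import Path


def parse_fasta(text: str) -> dict[str, str]:
    """Map each FASTA header (without '>') to its concatenated sequence."""
    records: dict[str, list[str]] = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


def read_fasta(path) -> dict[str, str]:
    """Read a FASTA file from disk and parse it."""
    return parse_fasta(Path(path).read_text())
```

Calling `read_fasta("inputs/P78609.fasta")` from the experiment folder and checking the sequence length against the 303 aa listed in the file index is a quick sanity check on the input.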

Seven analyses

| # | Analysis | C. utilis uricase verdict | Lactoferrin verdict | comp-010 delta |
|---|----------|---------------------------|---------------------|----------------|
| 1 | Codon usage (CAI proxy + hotspot scan) | HEAVY (CAI proxy 0.65; AT-biased yeast vs. GC-biased host) | LOW (proxy; full codon-opt required in practice) | Material: A. flavus LOW (CAI ~1.51) |
| 2 | KEX2 site geometry | HIGH (2 sites: pos 130 P1'=I, pos 138 P1'=S); not load-bearing (direct-secretion cassette) | MODERATE (1 moderate-risk site pos 579, P1'=K; 1 abolished pos 38, P1'=D) | Delta: A. flavus 1 site (128 HIGH) |
| 3 | Secretion-targeting signals | MODERATE (C-terminal TKL: UniProt-annotated microbody signal; weak PTS1 variant) | LOW (no routing issues) | Comparable (A. flavus SKL, same risk level) |
| 4 | Disulfide load | VERY LOW (0 disulfides) + 4 free Cys (new aggregation risk) | MODERATE (17 disulfides, 1.06× Huynh 2020 baseline) | Material: A. flavus 0 Cys (no free-Cys risk) |
| 5 | N-glycosylation sites | 1 predicted (NSS at pos 54; not UniProt-annotated) | 3 predicted, all UniProt-annotated (N137, N478, N623) | Comparable |
| 6 | Combined secretion burden | MODERATE | | comp-010 LOW |
| 7 | Comparison to comp-010 + Huynh 2020 | HARDER on codon + free-Cys axes; ALLN-346 strategic advantage | Identical to comp-010 | |

Key results

| Scope | Verdict | Primary driver |
|-------|---------|----------------|
| Overall cassette-design risk | MODERATE | Codon burden (HEAVY) + free-Cys aggregation risk |
| C. utilis uricase cassette | MODERATE | AT-biased codon preference; 4 free Cys |
| Lactoferrin cassette | MODERATE (identical to comp-010) | 1 moderate-risk internal KR site; 17 disulfides within Huynh precedent |
| comp-010 vs. comp-011 delta | comp-010 LOW → comp-011 MODERATE | Not a fundamental incompatibility; manageable with gene synthesis + SDS-PAGE QC |
| §1.9 cassette ordering recommendation | Head-to-head: order BOTH A. flavus (comp-010 verified) AND C. utilis (codon-opt + ALLN-346 mutations) | Empirical comparison resolves platform decision at $0 additional fermentation cost |

Disagreement protocol

If you reproduce the outputs and disagree with the methods or numbers, file a GitHub issue referencing this folder (comp-011-c-utilis-uricase-cassette-compatibility). Primary candidates for revision:

1. C. utilis codon preferences: the RSCU values used are Kazusa genome-wide approximations for related Candida/Cyberlindnera species; a species-specific table would improve precision.
2. KEX2 P1' classification: derived from S. cerevisiae Kex2p; A. oryzae kexB specificity is unpublished.
3. CAI proxy burden threshold of 0.90: introduced in comp-011 to capture the AT/GC mismatch that the RSCU < 0.4 threshold misses.
4. Free-Cys aggregation risk: the magnitude is unknown without wet-lab data.
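To make the two codon thresholds concrete, here is a hedged sketch of how a per-codon hotspot scan (RSCU < 0.4) and a whole-gene proxy score (burden below 0.90) could sit side by side. The proxy here is the geometric mean of host RSCU values over the gene's codons, which is an assumption and may differ from the formula in analyze.py. The direction of the 0.90 cutoff is inferred from the reported verdicts (0.65 HEAVY, ~1.51 LOW). The RSCU numbers in the toy table are invented for illustration and are not A. oryzae data.

```python
import math


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into codons; length must be a multiple of 3."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length is not a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def codon_proxy(cds: str, rscu: dict[str, float], hotspot_below: float = 0.4):
    """Return (geometric-mean RSCU, hotspots) for the codons present in `rscu`.

    Hotspots are (1-based codon index, codon) pairs with RSCU < hotspot_below.
    Assumes all RSCU values are strictly positive.
    """
    codons = split_codons(cds)
    scored = [(i + 1, c) for i, c in enumerate(codons) if c in rscu]
    if not scored:
        raise ValueError("no codons from the sequence are in the RSCU table")
    log_sum = sum(math.log(rscu[c]) for _, c in scored)
    score = math.exp(log_sum / len(scored))
    hotspots = [(i, c) for i, c in scored if rscu[c] < hotspot_below]
    return score, hotspots


# Invented toy values, NOT real host codon usage.
toy_rscu = {"GCC": 1.6, "GCA": 0.25}
score, hotspots = codon_proxy("GCCGCA", toy_rscu)
print(f"proxy={score:.3f} heavy={score < 0.90} hotspots={hotspots}")
```

A gene-level score like this can drift low when many codons are individually acceptable but collectively mismatched, a pattern the per-codon scan alone would not flag.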