comp-006: DAF/CD55 Shio-Koji Protease Stability
Question: Would the DAF/CD55 soluble ectodomain (aa 35–353) survive the shio-koji protease environment if expressed in A. oryzae?
Verdict: HIGH / HIGH / HIGH
Three verdicts, all HIGH, but for structurally distinct reasons:
| Scope | Verdict | Max risk score | Worst protease |
|---|---|---|---|
| Full sequence (aa 1–381, incl. signal peptide + GPI propeptide) | HIGH | 0.388 | NPr |
| Mature protein (aa 35–381, excl. signal peptide) | HIGH | 0.388 | NPr |
| Soluble ectodomain (aa 35–353) | HIGH | 0.388 | NPr |
The key finding: Unlike comp-005 (lactoferrin), where HIGH was signal-peptide-contingent and dropped to MODERATE once the signal peptide was excluded, the CD55 ectodomain verdict remains HIGH even in the most favorable engineering scope (aa 35–353). The driver is the Ser/Thr-rich stalk (aa 286–353, pLDDT 30–52) — 68 residues of fully disordered polypeptide within the soluble ectodomain itself. NPr alone has 9 exposed stalk sites; ALP has 48. The SCR1–4 domains (aa 35–285, pLDDT 85–98) are well-folded and largely resistant.
Critical caveat: This verdict is stalk-contingent, and the stalk is removable. A construct truncated at SCR4 (aa 35–285) would eliminate all 9 NPr stalk sites and 48 ALP stalk sites. A comp-007 analysis of the SCR1-4-only construct is the logical follow-up before dismissing CD55 as an engineering candidate.
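The residue ranges quoted for each scope are 1-based and inclusive, so a truncated construct is easy to get off by one when slicing in code. Below is a minimal sketch of the conversion. It uses a placeholder string rather than the real P08174 sequence, and the `SCOPES` dictionary and `slice_scope` helper are hypothetical names, not part of analyze.py or the shared library:

```python
# Hypothetical helper: convert the README's 1-based inclusive residue ranges
# into Python slices. Names here are illustrative, not taken from analyze.py.
SCOPES = {
    "full": (1, 381),
    "mature": (35, 381),
    "ectodomain": (35, 353),
    "scr1_4": (35, 285),  # proposed stalk-free construct
    "stalk": (286, 353),
}


def slice_scope(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}-{end} outside 1-{len(sequence)}")
    return sequence[start - 1:end]


# Placeholder sequence of the right length; NOT the real CD55 sequence.
placeholder = "X" * 381
lengths = {
    name: len(slice_scope(placeholder, start, end))
    for name, (start, end) in SCOPES.items()
}
print(lengths)
```

The stalk slice comes out at 68 residues, and SCR1–4 plus stalk add up to the 319-residue ectodomain, consistent with the ranges quoted above.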
Informs: wiki/modality-chokepoint-matrix.md — "Engineered soluble complement regulators" row (CP0 platform gap)
Interpretive wiki page: wiki/daf-cd55-protease-stability-computational.md
Companion experiments: comp-001 (uricase, LOW), comp-005 (lactoferrin, HIGH/MODERATE)
How to reproduce
Run analyze.py (the analysis script listed in the file index). No external packages are required (standard library only: json, pathlib). Outputs land in outputs/.
The core algorithm lives in experiments/lib/protease_stability.py — shared with comp-001 and comp-005. This script is the orchestrator; the library exports functions only.
File index

    comp-006-daf-cd55-shio-koji-protease-stability/
      analyze.py                       ← analysis script (run this)
      inputs/
        P08174.fasta                   ← human DAF/CD55 sequence (UniProt, 381 aa)
        alphafold_P08174_plddt.json    ← per-residue pLDDT scores (AF-P08174-F1-v6)
        protease_specificities.json    ← koji protease rules + shio-koji conditions (from comp-005)
        provenance.md                  ← sources, fetch dates, citations for every input
      outputs/                         ← generated by analyze.py; committed as artifacts
        cleavage_sites.json            ← machine-readable full results
        summary.md                     ← human-readable; cited in the wiki
      README.md                        ← this file
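Before running the analysis, it can help to confirm that the input files are in place. The sketch below is an illustrative pre-flight check and not part of analyze.py; the `missing_inputs` helper is a hypothetical name, and it assumes the three data files sit directly under inputs/ in the experiment folder:

```python
from pathlib import Path

# Data file names are taken from the file index; the check itself is illustrative.
EXPECTED_INPUTS = [
    "P08174.fasta",
    "alphafold_P08174_plddt.json",
    "protease_specificities.json",
]


def missing_inputs(experiment_dir: Path) -> list[str]:
    """Return expected input files absent from <experiment_dir>/inputs."""
    inputs_dir = experiment_dir / "inputs"
    return [name for name in EXPECTED_INPUTS if not (inputs_dir / name).is_file()]


if __name__ == "__main__":
    # Assumes the check is run from inside the experiment folder.
    missing = missing_inputs(Path("."))
    print("missing inputs:", missing or "none")
```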
Key results
| Protease | Exposed sites (full sequence) | Exposed sites (ectodomain aa 35–353) | Stalk exposed sites (aa 286–353) | Max risk (ectodomain) |
|---|---|---|---|---|
| ALP (alkaline subtilisin) | 98 | 48 | 48 | 0.188 |
| NPr (neutral metalloprotease) | 39 | 9 | 9 | 0.388 |
| acid_protease (aspergillopepsin) | 17 | 1 | 1 | 0.195 |
All ectodomain exposed sites are in the Ser/Thr-rich stalk (aa 286–353, pLDDT 30–52). The SCR1–4 domains (aa 35–285) contribute zero exposed sites. If the stalk is truncated from the engineering construct, the ectodomain verdict would shift substantially toward LOW.
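The truncation argument is simple bookkeeping on the table: subtract the stalk sites from the ectodomain sites for each protease. A minimal sketch, using only the counts reported above and assuming that removing the stalk does not itself expose any new sites:

```python
# Counts copied from the key-results table.
# Each tuple: (ectodomain exposed sites, stalk exposed sites).
EXPOSED_SITES = {
    "ALP": (48, 48),
    "NPr": (9, 9),
    "acid_protease": (1, 1),
}

# Sites that would remain in a stalk-free SCR1-4 construct (aa 35-285),
# assuming truncation does not expose any new sites.
remaining = {
    protease: ectodomain - stalk
    for protease, (ectodomain, stalk) in EXPOSED_SITES.items()
}
print(remaining)  # {'ALP': 0, 'NPr': 0, 'acid_protease': 0}
```

This is arithmetic on the reported counts, not a prediction; a rerun on the truncated construct would be needed to confirm it.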
Structural interpretation
CD55 has two structurally distinct zones:
- SCR1–4 domains (aa 35–285): pLDDT 85–98, well-folded, disulfide-stabilized (2 disulfides per domain). These carry the complement-regulatory activity. Structurally comparable to the uricase core in comp-001: highly resistant.
- Ser/Thr-rich stalk (aa 286–353): pLDDT 30–52, fully disordered. In the native membrane-anchored context, this is an O-glycosylated linker between SCR4 and the GPI anchor, with no known enzymatic or binding function. In a heterologous soluble ectodomain, the stalk is present as an unstructured polypeptide. Without native O-glycosylation (or with different A. oryzae O-glycans), this region is a high-accessibility protease target.
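The two zones can be recovered from a per-residue pLDDT profile by looking for long runs below a confidence cutoff. The sketch below is one way to do that and is not the method used in analyze.py. The `low_confidence_segments` helper, the cutoff of 70, and the minimum run length of 10 are all assumptions; any cutoff between the reported stalk range (30–52) and SCR range (85–98) would separate the zones. The profile is synthetic, not the real AF-P08174-F1 scores:

```python
def low_confidence_segments(plddt, threshold=70.0, min_length=10):
    """Return 1-based inclusive (start, end) runs where pLDDT < threshold."""
    segments = []
    run_start = None
    for index, score in enumerate(plddt, start=1):
        if score < threshold:
            if run_start is None:
                run_start = index
        elif run_start is not None:
            # Run covers run_start..index-1, so its length is index - run_start.
            if index - run_start >= min_length:
                segments.append((run_start, index - 1))
            run_start = None
    # Close a run that extends to the final residue.
    if run_start is not None and len(plddt) - run_start + 1 >= min_length:
        segments.append((run_start, len(plddt)))
    return segments


# Synthetic profile, NOT real data: 285 confident residues followed by
# 68 low-confidence ones, mimicking the two zones described above.
synthetic = [90.0] * 285 + [40.0] * 68
print(low_confidence_segments(synthetic))  # [(286, 353)]
```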
Disagreement protocol
If you reproduce the outputs and disagree with the methods or numbers, file a GitHub issue referencing this folder (comp-006-daf-cd55-shio-koji-protease-stability). Primary candidates for revision: ectodomain boundary (UniProt P08174 SV=4 annotates soluble ectodomain aa 35–353; verify against current UniProt entry), stalk boundary (aa 286–353 based on pLDDT drop pattern), disulfide bond correction factor, or protease pH factors.