Literature review: Five computational developability guidelines for therapeutic antibody profiling

This work is an OPIG (Oxford Protein Informatics Group) project, like SAbDab.

The authors are from the University of Oxford and four pharma companies: MedImmune (a subsidiary of AstraZeneca), the Computational and Modelling Sciences department of GSK, the Large Molecule Research department of Roche Pharma Research and Early Development, and the Chemistry department of UCB Pharma.

Background

Antibody developability issues other than affinity to the antigen:

  • intrinsic immunogenicity
  • chemical and conformational instability
  • self-association
  • high viscosity
  • polyspecificity
  • poor expression

Structural properties related to developability:

  • High levels of hydrophobicity (especially in the CDRs)
    • Effect: aggregation, viscosity and polyspecificity
  • Asymmetry in the net charge of the VL and VH domains
    • VH: variable domain of the heavy chain; VL: variable domain of the light chain
    • Effect (at high concentration): self-association and viscosity
  • Sequence motifs liable to post- or co-translational modifications
    • Effect: product heterogeneity, e.g. through oxidation, isomerization or glycosylation

Article summary

In this article the authors collected post-phase I therapeutic antibodies, summarized their sequence and structural properties, and compared those properties against a background of human immunoglobulin genes. The resulting rule-based criteria for telling the two groups apart are wrapped into a tool called the Therapeutic Antibody Profiler (TAP).

Data

  • Positive set: 137 clinical-stage antibody therapeutics (CSTs), with variable-domain heavy- and light-chain sequences.
  • Background set 1: a snapshot (subset) of the human antibody repertoire from the Observed Antibody Space database, an Ig-gene repertoire generated by NGS. The paired repertoires (VH/VL sequenced from single cells) were used to reconstruct the structure of the complete antibody. Article ref. Online DB link.
  • Background set 2: a larger proprietary dataset produced by UCB Pharma Ltd.

Modelling structures

The authors used ABodyBuilder to model antibody structures from reference PDB templates. PDB structures of the exact antibody were excluded as templates and used instead for validation, by evaluating the backbone RMSD across the IMGT regions.
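
The backbone RMSD check can be sketched as below. This is a minimal illustration that assumes the model and the crystal structure are already superimposed and that backbone atoms are listed in the same order. The function names and the region-keyed dictionary layout are hypothetical, not part of ABodyBuilder.

```python
import math


def backbone_rmsd(model_coords, reference_coords):
    """RMSD between two equally ordered lists of (x, y, z) backbone atoms."""
    if len(model_coords) != len(reference_coords):
        raise ValueError("coordinate lists must have the same length")
    if not model_coords:
        raise ValueError("no atoms to compare")
    squared_sum = sum(
        (a - b) ** 2
        for p, q in zip(model_coords, reference_coords)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared_sum / len(model_coords))


def rmsd_by_region(model, reference):
    """Both inputs map a region name (e.g. 'CDRH3') to a coordinate list."""
    return {
        region: backbone_rmsd(coords, reference[region])
        for region, coords in model.items()
    }
```

Reporting the value per region rather than for the whole domain keeps a poorly modelled loop from being hidden by a well modelled framework.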

A second validation checked the structural property calculation, using the identification of surface-exposed residues as the metric.

After validating the method, the authors modelled the structures of all three datasets as the basis for comparison.

Property comparison

Properties are calculated from the structures reconstructed by homology modelling.

  • CDR length (loop length)
    • IMGT numbering (ref)
    • Minor difference in the loop length distribution. The positive set has a slightly shorter total CDR length.
  • Canonical Forms
    • Structures of non-H3 CDR loops can be classified into canonical forms, with code names such as “L1-14-C”, “H2-9-A”, …
    • Still, some CST CDRs (<19%) cannot be assigned to a canonical form.
  • Hydrophobicity (the sharpest contrast in the results)
    • Estimated from each residue’s degree of apolarity and solvent-exposure status.
    • The authors created a metric called PSH (patches of surface hydrophobicity) to capture hydrophobic patches. The total PSH is the sum of pairwise terms over residue pairs, each divided by the square of the distance between the two residues’ closest heavy atoms. A high PSH value therefore indicates clustered hydrophobicity.
    • Salt bridge: residues forming a salt bridge are assigned the hydrophobicity of glycine.
    • Observation: CSTs have lower PSH in the CDR region.
    • Explanation: the high-concentration storage conditions of therapeutic antibodies disfavor large hydrophobic patches in the exposed CDR region.
  • Charge
    • Patches of positive and negative charge are scored with similar metrics.
    • Salt bridge: residues forming a salt bridge are treated as neutral.
    • The differences are not very pronounced, but the authors still include 3 charge-related metrics in TAP.
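
The PSH-style sum described above can be sketched as follows. This is a minimal illustration, not the TAP implementation: using the product of two per-residue scores as the pair numerator, the 7.5 Å cutoff, the `patch_score` name and the input layout are all assumptions. Callers are expected to pass only the surface-exposed residues of interest and to have already substituted the glycine score for salt-bridged residues.

```python
import math
from itertools import combinations


def closest_atom_distance(atoms_a, atoms_b):
    """Smallest distance between any heavy atom of A and any heavy atom of B."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def patch_score(residues, cutoff=7.5):
    """Sum score_i * score_j / d_ij**2 over residue pairs closer than cutoff.

    residues: list of dicts with 'score' (float) and 'atoms' (list of xyz).
    """
    total = 0.0
    for first, second in combinations(residues, 2):
        distance = closest_atom_distance(first["atoms"], second["atoms"])
        if 0.0 < distance < cutoff:
            total += first["score"] * second["score"] / distance ** 2
    return total
```

Because each pair is weighted by the inverse squared distance, a few hydrophobic residues packed together score higher than the same residues spread across the surface. The same function could in principle be reused for charge patches by passing charge magnitudes as the scores.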

Comments

Good practice in structural feature exploration, especially the metric-building techniques. Observer bias was not ruled out. The metric values for the CSTs and the background overlap considerably. Certain differences, especially the charge-related ones, are not obvious. The proposed TAP guidelines are good to know but not necessarily strict rules to follow.

The work lacks clinically failed antibodies as a negative set.

Final conclusion:

“not every human antibody would make a good therapeutic”

Notes

Abbreviations

  • CDR: complementarity-determining region
  • ASA: accessible surface area

IMGT

IMGT: international ImMunoGeneTics information system, “the global reference in immunogenetics and immunoinformatics”.

  • Scope: Immunoglobulins (IG), T cell receptors (TCR), major histocompatibility (MH).
  • Phylogenetic scope: human, other vertebrates, and invertebrates (for the immunoglobulin superfamily and the MH superfamily).
  • IMGT has its ontology system called “IMGT-ONTOLOGY”
  • IMGT Databases: sequence, genome, structure, monoclonal antibodies
  • IMGT web resources: including “sequence and 3D structure identification and description”, Numbering, Nomenclature, …
  • IMGT tools: sequence alignment, query, …
  • The IMGT numbering for coding regions is used for the description of mutations, allelic polymorphisms and structural data. ref
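
To illustrate how a shared numbering makes properties such as CDR length directly comparable, the sketch below counts CDR residues in an IMGT-numbered chain. It uses the standard IMGT CDR boundaries (CDR1: 27-38, CDR2: 56-65, CDR3: 105-117). The input layout of (position, insertion code, residue) tuples and the function names are assumptions, and a separate numbering tool would be needed to produce such input.

```python
# Standard IMGT CDR boundaries, inclusive.
IMGT_CDR_RANGES = {"CDR1": (27, 38), "CDR2": (56, 65), "CDR3": (105, 117)}


def cdr_lengths(numbered_chain):
    """Count occupied positions per CDR.

    numbered_chain: list of (position, insertion_code, residue) tuples,
    where residue is '-' for an unoccupied IMGT position.
    """
    lengths = {name: 0 for name in IMGT_CDR_RANGES}
    for position, _insertion_code, residue in numbered_chain:
        if residue == "-":
            continue  # numbering gap, no residue present
        for name, (start, end) in IMGT_CDR_RANGES.items():
            if start <= position <= end:
                lengths[name] += 1
    return lengths


def total_cdr_length(heavy_chain, light_chain):
    """Total CDR length summed over both chains of one antibody."""
    return sum(cdr_lengths(heavy_chain).values()) + sum(
        cdr_lengths(light_chain).values()
    )
```

Insertions (for example position 111 with code "A") count toward the loop they fall in, so long CDR3 loops are measured correctly.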

ABodyBuilder

ABodyBuilder: a tool in the SAbPred project. ABodyBuilder is a pipeline that integrates multiple tools (mostly from OPIG) to model the antibody structure from templates (or ab initio if required). The inputs are the heavy-chain and light-chain sequences; the outputs are a PDB file, a log, the CDR sequence annotation, and the templates, each available for selective download. The modelled structure can also be fed to downstream OPIG tools for further analysis. Article link

IgBLAST

IgBLAST is a BLAST-based tool developed by NCBI for immunoglobulin (IG) and T cell receptor (TCR) sequence analysis.

Salt bridge

From Wiki link:

“a salt bridge is a combination of two non-covalent interactions: hydrogen bonding and ionic bonding.” “The salt bridge most often arises from the anionic carboxylate (RCOO−) of either aspartic acid or glutamic acid and the cationic ammonium (RNH3+) from lysine or the guanidinium (RNHC(NH2)2+) of arginine. Although these are the most common, other residues with ionizable side chains such as histidine, tyrosine, and serine can also participate, depending on outside factors perturbing their pKa’s.”

In this work, salt bridges were defined as pairs of lysine/arginine and aspartic acid/glutamic acid residues with an N+ to O- distance <= 3.2 Å.
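
The distance criterion above can be sketched as a simple pairwise search. This is an illustration under assumptions, not the authors' code: the choice of which side-chain atoms count as N+ and O- (PDB atom names NZ, NE, NH1, NH2 and OD1, OD2, OE1, OE2), the function names and the input layout are hypothetical.

```python
import math

# Assumed charged side-chain atoms, using standard PDB atom names.
CATION_ATOMS = {"LYS": {"NZ"}, "ARG": {"NE", "NH1", "NH2"}}
ANION_ATOMS = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}


def _charged_atoms(residues, table):
    """Collect (residue_id, xyz) for every atom listed in the given table."""
    return [
        (res_id, xyz)
        for res_id, res_name, atoms in residues
        for atom_name, xyz in atoms.items()
        if atom_name in table.get(res_name, ())
    ]


def find_salt_bridges(residues, max_distance=3.2):
    """Return sorted (cation_id, anion_id) pairs within max_distance.

    residues: list of (residue_id, residue_name, {atom_name: (x, y, z)}).
    """
    cations = _charged_atoms(residues, CATION_ATOMS)
    anions = _charged_atoms(residues, ANION_ATOMS)
    bridges = set()
    for cation_id, cation_xyz in cations:
        for anion_id, anion_xyz in anions:
            if math.dist(cation_xyz, anion_xyz) <= max_distance:
                bridges.add((cation_id, anion_id))
    return sorted(bridges)
```

Residues returned here are the ones that would be given the glycine hydrophobicity or treated as neutral in the patch metrics described in the property comparison.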

Written on August 10, 2021