Tutorial 01: Ligand basics (donors, skeleton, atom roles)
This notebook introduces the core ligand abstraction in cTopo:
- you mark donor atoms
- cTopo derives the skeleton (donors + connectors)
- every atom gets a role label: DONOR / SKELETON / SUBSTITUENT
Prerequisites
You should have:
- a Python environment with RDKit
- cTopo installed (dev mode recommended):
pip install -e .
The docs build does not execute notebooks by default; run this notebook locally to see SVG depictions.
# Core imports (RDKit + cTopo)
try:
    from rdkit import Chem
    from rdkit.Chem import Draw
except ImportError as e:
    raise ImportError(
        'RDKit is required for these tutorials. '
        'If you used conda-forge: `conda install -c conda-forge rdkit`.'
    ) from e
from IPython.display import SVG, display, Markdown
import networkx as nx
import ctopo
from ctopo import AtomType, ligand_from_smiles, ligand_from_mol
def show(v):
    """Display a cTopo visualization object (with .svg)."""
    display(SVG(v.svg))
print(f'ctopo version: {ctopo.__version__}')
ctopo version: 0.1.0
1. Mark donor atoms in SMILES
In cTopo, donors are indicated by non-zero atom-map numbers in the SMILES:
- [NH2:1] means “this nitrogen is a donor”
- the actual map number value does not matter (only that it is != 0)
After parsing, cTopo clears atom-map numbers internally and stores donor indices in the Ligand object.
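The marking convention can be illustrated without cTopo or RDKit. The following is a minimal sketch, not cTopo's parser: it uses a regular expression to find bracket atoms that carry a non-zero atom-map number, and to strip the map numbers afterwards. The helper names marked_donors and strip_atom_maps are hypothetical, and the pattern only handles simple bracket atoms such as [NH2:1].

```python
import re

# Bracket atom with an atom-map suffix, e.g. "[NH2:1]" -> ("NH2", "1").
# Assumption: simple bracket atoms only; this is not a full SMILES parser.
_MAPPED_ATOM = re.compile(r"\[([^\[\]:]+):(\d+)\]")


def marked_donors(smiles):
    """Return (atom_text, map_number) for bracket atoms with a non-zero map number."""
    return [
        (m.group(1), int(m.group(2)))
        for m in _MAPPED_ATOM.finditer(smiles)
        if int(m.group(2)) != 0
    ]


def strip_atom_maps(smiles):
    """Remove atom-map numbers but keep the bracket atoms themselves."""
    return _MAPPED_ATOM.sub(r"[\1]", smiles)


smiles = "[NH2:1]C(C)C(C)[NH2:2]"
donors = marked_donors(smiles)
print(donors)                    # [('NH2', 1), ('NH2', 2)]
print(len(donors))               # 2, the denticity
print(strip_atom_maps(smiles))   # [NH2]C(C)C(C)[NH2]
```

A map number of 0 is treated as "unmarked" here, which mirrors the rule stated above that only non-zero values flag a donor.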
lig_en = ligand_from_smiles('[NH2:1]C(C)C(C)[NH2:2]') # substituted ethylenediamine
print(f'denticity: {lig_en.denticity}')
print(f'donor_atoms: {sorted(lig_en.donor_atoms)}')
print(f'skeleton_atoms (excluding donors): {sorted(lig_en.skeleton_atoms)}')
print(f'substituent_atoms: {sorted(lig_en.substituent_atoms)}')
denticity: 2
donor_atoms: [0, 5]
skeleton_atoms (excluding donors): [1, 3]
substituent_atoms: [2, 4]
Visualize the ligand (donors and skeleton highlighted)
v = lig_en.visualize_ligand()
show(v)
v.smiles
'CC(C(C)[NH2:1])[NH2:1]'
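2. Skeleton vs substituents
The skeleton is the union of all atoms that lie on a shortest path between any pair of donors. Every atom that is neither a donor nor on such a path is treated as a substituent. Note that the skeleton_atoms attribute lists only the non-donor connectors, so the full skeleton size is len(skeleton_atoms) + len(donor_atoms).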
This makes the “coordination-relevant core” explicit, while leaving peripheral decoration as substituents.
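The shortest-path definition can be sketched in plain Python on an adjacency dictionary. This is an illustration under stated assumptions, not cTopo's implementation: the function names bfs_distances and partition_atoms are hypothetical, and when several shortest paths tie (for example around a ring) the sketch keeps the atoms of all of them, which may or may not match cTopo's behavior. It relies on the fact that an atom v lies on some shortest path between donors a and b exactly when d(a, v) + d(v, b) = d(a, b).

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Breadth-first search: hop distance from source to every reachable atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def partition_atoms(adj, donors):
    """Split atoms into donors, skeleton connectors, and substituents."""
    dist = {d: bfs_distances(adj, d) for d in donors}
    on_path = set(donors)
    for a, b in combinations(sorted(donors), 2):
        if b not in dist[a]:
            continue  # donors in different fragments: no connecting path
        for v in adj:
            # v lies on some shortest a-b path iff the distances add up exactly
            if v in dist[a] and v in dist[b] and dist[a][v] + dist[b][v] == dist[a][b]:
                on_path.add(v)
    skeleton = on_path - set(donors)
    substituents = set(adj) - on_path
    return set(donors), skeleton, substituents


# Heavy-atom graph of [NH2:1]C(C)C(C)[NH2:2]; keys are atom indices
adj = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1, 4, 5], 4: [3], 5: [3]}
donors, skeleton, substituents = partition_atoms(adj, {0, 5})
print(sorted(skeleton))      # [1, 3]
print(sorted(substituents))  # [2, 4]
```

The result reproduces the partition printed for the substituted ethylenediamine in section 1: the two backbone carbons form the connectors and the two methyl carbons are substituents.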
# Compare two bidentate amines with different linker lengths.
lig_en2 = ligand_from_smiles('[NH2:1]CCC[NH2:2]') # 3-carbon linker
show(lig_en.visualize_skeleton())
print(f'en skeleton size: {len(lig_en.skeleton_atoms) + len(lig_en.donor_atoms)}\n')
show(lig_en2.visualize_skeleton())
print(f'propyl diamine skeleton size: {len(lig_en2.skeleton_atoms) + len(lig_en2.donor_atoms)}')
en skeleton size: 4
propyl diamine skeleton size: 5
3. Atom roles on the graph
Internally, cTopo converts an RDKit molecule to a NetworkX graph G.
Each node corresponds to an atom index and carries a rich set of attributes (atomic number, ring flag, etc.).
cTopo assigns an AtomType to each atom:
- DONOR
- SKELETON
- SUBSTITUENT
(For complexes there is also CENTER, with METAL as an alias for it.)
from ctopo import AtomType
AtomType.__members__
mappingproxy({'UNDEFINED': <AtomType.UNDEFINED: -1>,
'SUBSTITUENT': <AtomType.SUBSTITUENT: 0>,
'SKELETON': <AtomType.SKELETON: 1>,
'DONOR': <AtomType.DONOR: 2>,
'CENTER': <AtomType.CENTER: 3>,
'METAL': <AtomType.CENTER: 3>})
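In the listing above, METAL maps to the same member as CENTER, which is how Python enums represent aliases. The following stand-alone sketch rebuilds an enum with the same names and values to show what that implies for iteration and lookup. It assumes an IntEnum; AtomTypeSketch is a hypothetical stand-in, not cTopo's class, and the per-atom labels are made up for illustration.

```python
from collections import Counter
from enum import IntEnum


class AtomTypeSketch(IntEnum):
    UNDEFINED = -1
    SUBSTITUENT = 0
    SKELETON = 1
    DONOR = 2
    CENTER = 3
    METAL = 3  # same value as CENTER, so this becomes an alias


# An alias is the very same member object as its canonical name
print(AtomTypeSketch.METAL is AtomTypeSketch.CENTER)   # True

# Iteration yields canonical members only; __members__ also lists aliases
canonical = [m.name for m in AtomTypeSketch]
print(canonical)   # ['UNDEFINED', 'SUBSTITUENT', 'SKELETON', 'DONOR', 'CENTER']
print("METAL" in AtomTypeSketch.__members__)   # True

# Looking up by value returns the canonical member, handy for labelling counts
roles = [2, 1, 0, 1, 0, 2]   # made-up per-atom integer labels
role_counts = Counter(AtomTypeSketch(r).name for r in roles)
print(dict(role_counts))   # {'DONOR': 2, 'SKELETON': 2, 'SUBSTITUENT': 2}
```

Using the member's name attribute avoids depending on how str() formats an enum member, which differs between Python versions for IntEnum.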
# Map integer values to AtomType names (at.name is stable across Python versions)
mapping = {int(at): at.name for at in AtomType}
# Count atom types; atoms without an 'atom_type' attribute count as UNDEFINED
G = lig_en.G
counts = {}
for n, data in G.nodes(data=True):
    at = mapping[data.get('atom_type', int(AtomType.UNDEFINED))]
    counts[at] = counts.get(at, 0) + 1
counts
{'DONOR': 2, 'SKELETON': 2, 'SUBSTITUENT': 2}
4. A quick tridentate example
Many multidentate ligands share a small number of topologies at a given denticity. Here is a simple “tripod-like” tridentate amine (3 terminal donors, central non-donor connector):
lig_tren_like = ligand_from_smiles('N(CC[NH2:1])(CC[NH2:2])CC[NH2:3]')
print(f'Ligand (denticity = {lig_tren_like.denticity}):')
show(lig_tren_like.visualize_ligand())
print('\nSkeleton:')
show(lig_tren_like.visualize_skeleton())
print('\nTopology:')
show(lig_tren_like.visualize_topology())
Ligand (denticity = 3):
Skeleton:
Topology:
Takeaways
- Donor atoms are marked via atom-map numbers in SMILES.
- Skeleton is defined by shortest donor-to-donor paths.
- Atom roles give a chemically meaningful partition of the molecule.
Next: Tutorial 02 dives deeper into topology reduction rules and how to use topology/skeleton keys for dataset analysis.