{ "cells": [ { "cell_type": "markdown", "id": "cb74ce1f-c7c9-4916-a1b7-baab8ae661b0", "metadata": {}, "source": [ "# Compound properties" ] }, { "cell_type": "markdown", "id": "9ffa0885-f936-47fb-a8a1-0acf229c6919", "metadata": {}, "source": [ "## Initialization\n", "\n", "NIST Chemistry WebBook compound can be initialized via NIST Compound ID, CAS Registry Number, or InChI string:" ] }, { "cell_type": "code", "execution_count": 1, "id": "dbf6b78d-dc00-4ec2-ab0d-6008e3912258", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "NistCompound(ID=C632053)" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import nistchempy as nist\n", "\n", "X = nist.get_compound('C632053')\n", "X" ] }, { "cell_type": "code", "execution_count": 2, "id": "bea0e054-e0ec-4958-a3ea-6b8a344fb8db", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "NistCompound(ID=C632053)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = nist.get_compound('632-05-3')\n", "X" ] }, { "cell_type": "code", "execution_count": 3, "id": "f7a4c154-96d1-4749-9b15-288856ca9cfd", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "NistCompound(ID=C632053)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = nist.get_compound('InChI=1S/C4H7Br3/c1-3(6)4(7)2-5/h3-4H,2H2,1H3')\n", "X" ] }, { "cell_type": "markdown", "id": "078b752f-6f62-4920-b18b-5aec58022c2a", "metadata": {}, "source": [ "If there are no compound with given identifier in the NIST Chemistry WebBook database, `nist.get_compound` will return `None`. The same result will occur if multiple substances correspond to the given InChI." ] }, { "cell_type": "markdown", "id": "9d715d52-7687-4bd6-938d-ca353b659f48", "metadata": {}, "source": [ "The other way of compound initialization is to run the search and load found compounds (see the **Basic Search** section of the CookBook)." ] }, { "cell_type": "markdown", "id": "6b65ad21-fefd-4ce2-9097-fbf3d1c1bf06", "metadata": {}, "source": [ "## Properties\n", "\n", "The `nist.compound.NistCompound` object contains information extracted from the NIST Chemistry WebBook's compound web page. It can be divided into three groups:\n", "\n", " - **Basic properties** — properties which are already extracted from the compound web page:\n", "\n", " - `ID`: NIST Compound ID;\n", "\n", " - `name`: chemical name;\n", "\n", " - `synonyms`: synonyms;\n", "\n", " - `formula`: chemical formula;\n", "\n", " - `mol_weight`: molecular weigth;\n", "\n", " - `inchi` / `inchi_key`: InChI / InChIKey strings;\n", "\n", " - `cas_rn`: CAS Registry Number.\n", "\n", " - **Reference properties** — dictionaries {property name => URL}. There are four subgroups:\n", "\n", " - `mol_refs`: molecular properties, which are 2D and 3D MOL-files;\n", "\n", " - `data_refs`: WebBook properties, which are stored in NIST Chemistry WebBook;\n", "\n", " - `nist_public_refs`: other properties, which are stored in public NIST websites;\n", "\n", " - `nist_subscription_refs`: other properties, which are stored in paid NIST websites.\n", "\n", " - **Extracted properties** — properties extracted from the URLs provided by **reference properties**:\n", "\n", " - `mol2D` / `mol3D`: text blocks of 2D / 3D MOL-files;\n", "\n", " - `ir_specs` / `thz_specs` / `ms_specs` / `uv_specs`: JDX-formatted text blocks of IR / THz / MS / UV spectra;\n", " \n", " - `gas_chromat`: gas chromatography data.\n" ] }, { "cell_type": "code", "execution_count": 4, "id": "184aa682-2382-4111-a251-6e516d9f2afa", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'_request_config': RequestConfig(delay=0.0, kwargs={}),\n", " '_nist_response': NistResponse(ok=True, content_type='text/html; charset=UTF-8'),\n", " 'ID': 'C120127',\n", " 'name': 'Anthracene',\n", " 'synonyms': ['Anthracin',\n", " 'Green Oil',\n", " 'Paranaphthalene',\n", " 'Tetra Olive N2G',\n", " 'Anthracene oil',\n", " 'p-Naphthalene',\n", " 'Anthracen',\n", " 'Coal tar pitch volatiles:anthracene',\n", " 'Sterilite hop defoliant'],\n", " 'formula': 'C14H10',\n", " 'mol_weight': 178.2292,\n", " 'inchi': 'InChI=1S/C14H10/c1-2-6-12-10-14-8-4-3-7-13(14)9-11(12)5-1/h1-10H',\n", " 'inchi_key': 'MWPLVEDNUUSJAV-UHFFFAOYSA-N',\n", " 'cas_rn': '120-12-7',\n", " 'mol_refs': {'mol2D': 'https://webbook.nist.gov/cgi/cbook.cgi?Str2File=C120127',\n", " 'mol3D': 'https://webbook.nist.gov/cgi/cbook.cgi?Str3File=C120127'},\n", " 'data_refs': {'cTG': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=1#Thermo-Gas',\n", " 'cTC': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=2#Thermo-Condensed',\n", " 'cTP': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=4#Thermo-Phase',\n", " 'cTR': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=8#Thermo-React',\n", " 'cSO': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=10#Solubility',\n", " 'cIE': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=20#Ion-Energetics',\n", " 'cIC': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=40#Ion-Cluster',\n", " 'cIR': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=80#IR-Spec',\n", " 'cMS': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=200#Mass-Spec',\n", " 'cUV': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=400#UV-Vis-Spec',\n", " 'cGC': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C120127&Units=SI&Mask=2000#Gas-Chrom'},\n", " 'nist_public_refs': {'Gas Phase Kinetics Database': 'https://kinetics.nist.gov/kinetics/rpSearch?cas=120127',\n", " 'X-ray Photoelectron Spectroscopy Database, version 5.0': 'https://srdata.nist.gov/xps/SpectralByCompdDd/21197',\n", " 'NIST Polycyclic Aromatic Hydrocarbon Structure Index': 'https://pah.nist.gov/?q=pah015'},\n", " 'nist_subscription_refs': {'NIST / TRC Web Thermo Tables, \"lite\" edition (thermophysical and thermochemical data)': 'https://wtt-lite.nist.gov/wtt-lite/index.html?cmp=anthracene',\n", " 'NIST / TRC Web Thermo Tables, professional edition (thermophysical and thermochemical data)': 'https://wtt-pro.nist.gov/wtt-pro/index.html?cmp=anthracene'},\n", " 'mol2D': None,\n", " 'mol3D': None,\n", " 'ir_specs': [],\n", " 'thz_specs': [],\n", " 'ms_specs': [],\n", " 'uv_specs': [],\n", " 'gas_chromat': []}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# anthracene example\n", "s = nist.run_search('anthracene', 'name')\n", "X = s.compounds[0]\n", "X.__dict__" ] }, { "cell_type": "markdown", "id": "fc7d0742-ae64-422a-999e-e7f1da5f2330", "metadata": {}, "source": [ "## MOL-files\n", "\n", "To load MOL-files, one can use `get_mol2D`, `get_mol3D`, or `get_molfiles` methods:" ] }, { "cell_type": "code", "execution_count": 5, "id": "369bc56f-fb93-4a61-90be-7bda1e432376", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(False, False)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X.get_molfiles()\n", "X.mol2D is None, X.mol3D is None" ] }, { "cell_type": "code", "execution_count": 6, "id": "b80185db-4e89-4d9e-8d59-f62b7579b8c1", "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAIAAADCEh9HAAAABmJLR0QA/wD/AP+gvaeTAAAWrklEQVR4nO3db0xTVxsA8EsBmRVfCoj8E6KiFhb+uAh1UxOisIDAzL6wxMRGsyzdErXMOALRJWz7sHU6Z0dIJlvmrPvG9kHK4gQc+CdKAUsciqydThYBwdGpjIBQ2vt+OHvv24Fi7z23955z+/w+LEvswcdr+5w/z3lqCMuyDAAAAKFUcgcAAAB0gzQKAABYII0CAAAWSKMAAIAF0igAAGAJkzsAulmt1tHR0ZKSkuTkZLljUY7e3l6bzabVavPz8+WOReGGhobOnj0bHx+/Y8cOuWOhGQsEcTqdmZmZ6BmqVKo9e/bMzs7KHRT1Zmdnd+/eHRISgh5sdna20+mUOyhl8ng8e/fuDQ0NRY967dq1/f39cgdFK0ijvLndbrPZrFL9cx6yaNEi9D/Jycnd3d1yR0ex3t5eblEfERHBTVFHjx6dnp6WOzpF6e7u5h419wZWqVRms9ntdssdHX0gjfJz4cKF7Oxs9LaLjo4+c+aMx+M5cOBAWFgYwzAhISF6vX5kZETuMCkzOTlZU1ODPs9hYWEGg4FlWavVunz5cvSoU1NTLRaL1+uVO1LquVwuo9GIFqGhoaG7du1yu90tLS1xcXHoUaenp587d07uMCkDadRfg4ODer0e7TdTUlJOnTrl+6vj4+M1NTVoDaXRaEwmEyyg/NTW1rZu3Tq0GjIYDA8fPvT91ebmZm7e0ul0ly5dkitO2rnd7vr6+mXLljEMEx4ebjQa5zzqxsbG1atXo0ddVlZ29+5dmSKlD6TR55uenjabzUuXLmUYRq1W19TUTE1NPfWVTqeztLQUvRG1Wi3M6gu7f/++Xq/njkE7Ojqe+jKPx2OxWBISErhP+O3btyUOlXa+u6iCgoKbN28+9WX+v9WBL0ijz2G1WvlO0a2trenp6TCrL8Dr9VosltjYWO7j+tzF+8TEhMlkQp/w8PBwg8Hw4MEDaaKlmu8uKi0traGhgdeQlJQUi8UiQZxUgzT6TDhLy5mZGW5WX7x4Mczqvnp7e1955RX0YEtLS3lNM4ODgwaDAR3tRUdHm0ymJ0+eBCxSuqGlZWRkpLClpZ8LWMBCGn2qiYkJUQ46h4aGuFl9xYoVMKv7lpISExMFP5C+vr6SkhKoPi1gzi5qYGBAwA+Zf5z6+PFj0UNVAEij/+L1ehsaGlJSUriy++joKObP9J3Vt23bFrSz+pxSEv4HsrW1FapP8zmdTm6OSU9Pb25uxvyBvsX9xMTE+vp6j8cjSqiKAWn0/3p6ejZv3ozef7m5uc+qeAiAiiS+s/qjR4/E+uHk87OUJABUn3zN2UWJewnUbrcH6NOhAJBGWfbf821CQkKA5ltpfheiCCglCQDVJ99dlEqlEmUXJdfvQqNgT6Pz14l8N5vd3d12u93/1/f09GzZsoWb1a9evcozZGr09vZu2rRJWCmJZdlvv/12YmLC/9cHbfXJd52Yl5fHd504MzNjsVj8b2UO6JqXUkGdRvFPLWdnZ3NycvjOzGhWT01NZRTa+IRfSjp79izDMElJSfX19by+rCCoqk+inFoeP34cnaK2tLT4P0r0E1iqBWka9a2h49yMm5ycPHjwYHh4OMMwMTExdXV1/s/MYt0HIE1bW5tWq8UsJXV1dW3cuBF9SrOysvg2Mii++iRiDb2pqWnlypVoRt+5c+fg4KD/Y0W5D6AAQZdGA3GjE2dmVlLj08jIiLilJKvVmpaWhn5gYWHh9evX/R+r4OqT6Dc6JycnTSaTsBummLdTlSG40qjvx1L0/iKcmZn2xqfAlZJmZma4ZRc6PBkeHvZ/uMKqTwJakvj+cPQm5PvDg7zxKVjSqDSLPpyZmd7GJ8xSkj9cLldVVRU6AFGr1VVVVbz2sAqoPkm26Gtvb8/KyuJ2AH19ff6PDdrGJ+WnUemPIHGWDHQ1PonVleSngYEB7uEEVfVJ4iNIdPCK9hZ8D16Ds/FJyWk0EC1J/sOZmS9evEh+4xNXSgoJCRGlK8lPNpuNuzGWkZHR1NTEazhd1ScZC+JjY2O+1wB4zTrB1vik2DQ653qmLE0XODMzyY1PopeS+EITJLdAU2T1iZDrmXa7nTuxycvLs9lsvMbiXGiliALTKGnNQjgzs+/Y2NhYs9ks759Fmq4kP6HjQo1Go7DqE3rIKMuT0Cw0Px7/H1SQND4pKo2SvILDmZkJaXySoJQkwPzq0/j4uP/DSas+EbuCQ6tjdA4eHR1tNpuh8YmjnDRK/hcp4awy5G188i0lJSQkEFj4cjqd5eXlVFefqDhPdDgc27dvRw8qJyeH18myghuflJBGxWpJkgbOzCxL45NcpSQBbDYbt5SjqPpEXXXbarWuWrVK2M0BRTY+0Z1G6b1rSUXjk+ylJAGoqz61t7fTeNcSGp98UZxGae/8YQlufCKqlCQAFdWngLYkSePevXvcRLtmzRrBjU+U/vE5VKZRJfWhE9j4RGYpSQBiq08KW461tbUFeeMTZWlUqd+KREjjE/mlJAFIqz4p8nAwyBufqEmj81uSFPYdnazcjU++pSS9Xu9yufj+BJKRUH1ScKkaCdrGJzrSKCEXJyUgS+MTjaUkAWSsPin+4qSvOY1PnZ2dvMaSeW12YaSnUdJakqQhWeMT7aUkASSuPpHWkiQN9KeOj49ngqPxidw0SnJLkjQC3fjkcDi2bt2KXlNSUkJvKUkAaapPlK6txBI8jU+EptG6ujqu9kdmS5I0cGbmBRqfFFlKEiBw1SeqT/rE5dv4tH79esGNT1qt9tixY4GLEwdxabSnp4c7vSL/CzelIW7jU0tLi4JLSQKIW31qb2+nve4cCGI1PsXHx1+5ciVwcQpDVhq12+0qlQo9r6KiIqov04kOp87rcDiKi4vRWLT4ysnJ4fWlZ8qGWX1yu90nTpzgjgLRDykuLnY4HIGLmTo4jU9TU1NFRUXowapUKl7/pLkEyEqj+fn5DMMsXbo02E6R/Idz67CxsVGj0URGRlZXV5N80iQXzOrT+Pj43r17lyxZotFoGhsbAxcn1XAanzo6OlBxLz8/P2ABCkFWGtXpdKjcIXcgRMPpgUHb+f7+/oBGSDWc6lN/fz86xQtohAoguPEJbch0Ol1Aw+Prnw0IoMiiRYsqKip+/fVXvV4/NTX14YcfZmZmfv/993LHpRAxMTEmk+nGjRvl5eVTU1Offvppenr6V1995fF45A5NObZu3drT02M2m6Oios6fP79+/fqKiorx8XG54xKIrDS6bt067r9gYcnJyadPn0bfD3Tnzp033ngDzepyx6UQa9eubWho6Ojo2Lx58/Dw8Ntvv52VlfXjjz/KHZdyhIWFVVRU3Llzx2g0er3e2trajIyM06dPsyy7wCgyUwRZaTQxMZH7L/BHfn6+3W5HpeGff/75pZdeonpWJ83GjRsvX76Mqk/9/f2vvfbaq6+++ssvv8gdl3LExsZ+8cUXnZ2dmzZtGh4e3r1798svv9zV1fWs15OZIshKowubnp4eGxubmJiQOxCyhIWFGQyGW7duvfXWWx6Pp7a21m63yx2UcoSEhJSXl9+6devIkSMajeb8+fPbtm2bnJyUOy5F2bBhw+XLl7/55pv4+Piurq66ujq5I+KHpjT63XffxcXFvfvuu3IHQqK4uLivv/66s7OzpqaG600CYomIiKisrLx9+7bRaDx8+LBarZY7IqVRqVRvvvmmw+F47733TCaT3OHwEyZ3AEBMubm5ubm5ckehWGgHKncUShYVFXX06FG5o+CNptUoAAAQCNIoAABggTQKAABYII0CAAAWSKMAAIAF0igAAGCBNAoAAFggjQIAABZIowAAgAXSKAAAYIE0CgAAWCCNAgAAFkijAACABdIoAABggTQKAABYII0CAAAWSKMAAIAF0igAAGCBNAoAAFggjQIAABZIowAAgAXSKAAAYIE0CgAAWCCNAgAAFkijAACABdIoAABggTQKAABYII0CAAAWSKMAAIAF0igAAGCBNAoAAFggjQIAABZIowAAgAXSKAAAYIE0CgAAWCCNAgAAFkijAACABdIoAABggTQKAABYII0CAAAWSKMAAIAF0igAAGCBNAoAAFggjQIAABZIowAAgAXSKAAAYIE0CgAAWCCNAgAAFkijAACABdIoAABggTQKAABYII0qyrVr1z744AO5o1Asl8tVUVHx+eefyx2IYj1+/LiysnJ4eFjuQPgJkzsAHvR6/euvv/7CCy/IHQiJ/vzzz0OHDp08edLr9ebn52/dulXuiBRlenq6trb2448/fvToUUxMzDvvvKNWq+UOSlG8Xu+pU6cOHTo0Ojo6Ojp6+vRpuSPigaY0GhERERERIXcUxJmdnT158uThw4fHxsbCw8P37du3YcMGuYNSDpZlf/jhh+rq6t9//51hmMLCws8++wxyqLjsdvv+/fs7OjoYhtHpdPv27ZM7In7ISqP379/n/gv8ceHCBaPReOPGDYZhCgoKamtrX3zxRbmDUo7Ozs6DBw9euXKFYZiMjIwjR46UlZXJHZSiuFyujz76qK6uzuv1JiUlffLJJ3q9PiQk5FmvJzRFsCTR6XQMw5SUlMgdCAUGBwe5N1xaWlpDQ4M/o7RaLcMw/f39gQ6Pdk6ns7y8HD3epKSk+vr62dnZ547q7+9nGEar1UoQIe3cbrfZbI6KimIYJjw83Gg0jo+PP3dUSUkJwzA6nU6CCP1H1moU+GNmZubLL798//33JyYm1Gp1ZWVldXU1HBmL5a+//jpy5IjZbJ6enlar1fv37z98+PDSpUvljktR2tvbjUbjzZs3GYYpLCysra3NyMiQOygMcufxf8nPz2cYZsmSJVeuXJE7FkJZrdbVq1ejv7uysrKBgQFeYzUaTWRkZHV1tdvtDlyQlJqenjabzRqNhmEYlUql1+uHh4f9Hz4xMVFZWRkZGRkVFWWxWAIXJ9Xu3bun1+vRG3jNmjVNTU3+j+3s7PzPf/7DMEx+fn7AAhSCrDRqt9tVqn/uYBUVFU1NTckdEUEcDgfa0TAMk56e3tzczGtscXExGos2qjk5OTabLXDR0sXr9TY0NHDzU2Fh4fXr1/0f7na7T5w4ER8fj/Iv+iHFxcUOhyNwMVNncnLSZDJFRkYyDKNWq2tqavz/gE9NTRUVFaEHq1Kp7HZ7QEPli6w0yrJsT08P925esWIFzOosy05MTNTU1KBbCtHR0Waz2f+1pO9YjUZjMplaW1vRCWlISIher3e5XAENnnw2m23z5s3oLZeRkcFrfcSybGtra3Z2Nhqu0+na29vr6+uXLVvG/O/I7/HjxwGKnCJWq3XVqlXcLuqPP/7gNZbLCfHx8QRuVYlLo0hdXR331ty2bduNGzfkjkgeXq/XYrEkJCRw28zR0VH/xzY0NKSkpHAZkxs7OTlZU1OzaNEihmESEhKCdq4SVkfi9PX1cfuD1NRUi8Xi9XrRL7lcLqPRGBoayjBMYmJifX29x+MJzB+CdL47ofXr11+6dInXWO4Ja7XaY8eOBS5OHISmUZZlPR6PxWKJi4vjZvVHjx7JHZSk7Hb7pk2b0HsoLy+P1x68p6dny5YtaGxubm5HR8f81zgcDu6WfklJyd27d0ULnXgul6uqqgot0tVqdVVVlT9lYs7g4KDBYEBZMjo62mQyPXnyZP7L7HY7t87Ny8t76t+Cgv3999/cbI12Uf7PUjg7MOmRm0YR31k9NjbWbDYHw6yOs5bxHZuQkLDwWLTajY2NZRhm8eLFNTU109PTIv0hCIVfRzKZTKhwHx4ebjAYHjx4sMDrffcEfPcT9ELvK+6wWK/XL/yU5o8VtgOTC+lpFJmztrp69arcEQWK2+0WfLKG1u++Y/1cv4+MjHDF0+zsbKUumjDrSOjxoo83OuC7ffu2n2PnnFATvrbCdO3aNW4XpdPpOjs7/R87Z/1OSxWUjjTK/u8zkJqayp30jYyMyB2UyNrb27OystB7qKCgoK+vz/+xFy5c8D1NvnnzpoDfXcGlJ3HrSLwO+DhOp1PwXQsqjI2NcTuhpKQk38Pi56L6NJmaNIrMrzsrYxMqrCUJGRoa4sampKTg1IsUWXoKXB1JGJybv8Sa35Lk/y4KZwdGCMrSKPLbb7+VlpZy9btz587JHZFw6KhO2GU6NBad06GTTVFu2iqm9CRNHUkAnL90ArW1tWVmZnJHJbx2UTg7MHJQmUaR1tZWroGsrKyMxk87ZkuS71hx//i0l54kriMJ47sFwdxGyGVOSxKvXRTODow0FKdRlmVnZmYCsRyTAE5LktPplGYxTmPpScY6kjC+h9oFBQUCDrVlgdOSpLDFOEt7GkV8DwfJb3wStyVJgkUiRaUnEupIAlB3OChWS5JijoaVkEaRixcvEt74FKCWJAmQX3oirY4kABWlaofDsX37dvSgcFqSFHZRQTlplCW78QmzJYlbZD2rJUkCZJaeiK0jCeN7cVLGv+v50E4oGFqSBFBUGkVIa3ySrCVJAkSVnqioIwlAWuNTsLUkCaDANIqQ0PgkbksSOedlspeeqKsjCUBI45NvS1JeXl4wtCQJoNg0ysrd+CRvS5IE2traZCk92Ww2boKkqI4kjIyNT0HbkiSAktMoIn11G+dCHF13CSUuPQ0MDHAPh9I6kjABvSM8X5C3JAmg/DSKSHPXksCWJAlIUHqaX0fi9eEkrY4kgGR3Ldva2rhdFGZLEpm7qEAIljSKBLTxidiWJAn4lp7Qh1ysJf/MzAy3ulFSHUmYgG5WoCVJsOBKo2xgGp+oaEmSgOilJ6vVmpaWpuA6kjCiNz5BSxKmoEujiFiNT5OTkwcPHgwPD2cYJiYmpq6ujvCWJAmIUnrq6urauHEjyhRZWVl8Zxe66kgCiHgE2dTUtHLlSvT3tXPnzsHBQf/HKrIlSYAgTaPInMYnAbP67OxsTk4OXS1JEsAvPf3000/BVkcSQJSC+PHjxxmGycnJ4TXZKP67U3kJ6jTKYnxjPKe7u5vXP/dKSEuSBHp7e7krh6WlpXwPfL/99tuJiQn/X6+AOpIwmI1PMzMzFotFWEuS4r/J30/BnkYRaRqfSGtJkkDgSk++FFZHEkCaxqcgaUkSANLo/wWu8YnkliQJBK7rScF1JAECuk4M8n/ldGGQRv8lEI1PVLQkScC39GQwGPAnEsXXkYSZc2qJf/cj2FqSBIA0+hRi1dDpakmSgFhdT0FVRxJGlJvIwdmSJACk0WfCudHp25IUtJfpngWn9BS0dSQBMN+ElH4tvywgjT5Ha2treno6r1md9pYkCQgoPUEdSRgBW6Igb0kSANLo8/nf+KSkliQJ+Fl6gjoSPj+XltCSJAykUX/NaXw6deqU76+Oj48rsiVJAnNKTw8fPvT91ebmZqgjiWL+QeecR93Y2AgtScJAGuXHd1aPjo4+c+aMx+M5cOBAWFgYo9CWJAn4lp7CwsIMBgPLslardfny5VBHEpdv2T00NHTXrl1ut7ulpQX90zsMtCQJAmmUN/RtjCqVCr3t0IefYZjk5OTu7m65o6NYb29vcnIyephoXc8wjEqlOnr0KCztxdXd3c09au4NrFKpoCVJGEijAjmdzszMTO79t2fPHl6t3+CpZmdnd+/ejU5O0IGp0+mUOyhl8ng8e/fuRctShmHWrl3b398vd1C0CmFZlgFCWa3W0dHRkpISbm4H+Hp7e202m1arzc/PlzsWhRsaGjp79mx8fPyOHTvkjoVikEYBAACLSu4AAACAbpBGAQAAC6RRAADAAmkUAACwQBoFAAAs/wUx08zANz9xbQAAAOJ6VFh0cmRraXRQS0wgcmRraXQgMjAyMy4wOS41AAB4nHu/b+09BiDgZ0AAPiAWAOIGRjaHDCDNzIyfoQFisCBoclSgM7gZGTKYGJgSmBkTmFgzmFiYgZg1gZ0lgZUtgY0zg4mDHYg5E7g5Eji5Erh4M5h4uDOYeHkSRJiB+tkYmFhZgIaxcXKws7CycfHycHNwim8CyjAi+5FB8+1ee5jA0X9MYLbZ6jj77pz/cDZMvnbdPXuYGhAbptdgt7IDTD2IDVN/a02cA0w9iA1Tv1SnG64exIap52FaDlcPYsPUiwEAjUNAAxurvUYAAAFWelRYdE1PTCByZGtpdCAyMDIzLjA5LjUAAHichZPbasMwDIbv8xR6gGEsW47t3eVQRhlNYOt2X0Zhu+nF6PszKSOxOornHLCVz85v/XJ3uX5+nz7Ol/MD7MdHGNBZdLEBaS/j89cVtubGhuO2cuec4d1ba5sDSAf63dN+guHY9WtkmN+m4ysgAbY8h69btjvOhzWCMIA1dmmAhpJP3Nki20ynOWsCYrzL+YVLbU7LejmHcJcjzamF/3KBOTTR51j/b6u5yj4ic86EkHxdX9JcRV9mzhuK2df1odVgRSCKI2Q8ZVdXiE6DFYkongTjMP/jHZIGKxp303hTRL9l1c/TWMqKuGSwVA/XIvhSJCTfSy0Qh9piOclTnJVhWwwkNicUn4iJWOwgJlCnnTiYVHZljDqLJK+sskUyDVGlhWQibpJQJDKI2xZQVMqB8zpHOiMyXg8p95sfe+rD7uaS1yIAAAB/elRYdFNNSUxFUyByZGtpdCAyMDIzLjA5LjUAAHicPY5RDsAgCEOvss8tIQREJsTjcA0PP40Ovtq8hjY4IkqExDpZmq9xExAqc4NOwFhNbCq0121ad1XojE28JS2oapJUsDaXpBWlekmqWNgTbvcXnijljx+cBsrqnLiHETzjA99yLCQAxHnlAAAAAElFTkSuQmCC", "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from rdkit import Chem\n", "\n", "mol = Chem.MolFromMolBlock(X.mol2D)\n", "mol" ] }, { "cell_type": "markdown", "id": "9e5fc177-d6d4-4f3d-abab-463ece22baff", "metadata": {}, "source": [ "## Spectra\n", "\n", "To load spectra, one can use `get_ir_spectra`, `get_thz_spectra`, `get_ms_spectra`, `get_uv_spectra`, and `get_all_spectra` methods:" ] }, { "cell_type": "code", "execution_count": 7, "id": "1144e47e-a659-49a3-81e1-6876706eaf65", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "([], [], [], [])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X.ir_specs, X.thz_specs, X.ms_specs, X.uv_specs" ] }, { "cell_type": "code", "execution_count": 8, "id": "3960b362-89cd-4153-a658-24aaf4e50ee5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "([], [], [Spectrum(C120127, Mass spectrum #0)], [])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X.get_ms_spectra()\n", "X.ir_specs, X.thz_specs, X.ms_specs, X.uv_specs" ] }, { "cell_type": "markdown", "id": "58cc075d-5efc-4558-8b88-e1f159a4bb74", "metadata": {}, "source": [ "Spectrum object contains JDX-formatted text block of the spectrum which includes both meta-information and spectral data:" ] }, { "cell_type": "code", "execution_count": 9, "id": "33060848-c874-448a-aa12-47b8c5c4cdd4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "##TITLE=Anthracene\n", "##JCAMP-DX=4.24\n", "##DATA TYPE=MASS SPECTRUM\n", "##ORIGIN=Japan AIST/NIMC Database- Spectrum MS-NW- 132\n", "##OWNER=NIST Mass Spectrometry Data Center\n", "Collection (C) 2014 copyright by the U.S. Secretary of Commerce\n", "on behalf of the United States of America. All rights reserved.\n", "##CAS REGISTRY NO=120-12-7\n", "##$NIST MASS SPEC NO=228201\n", "##MOLFORM=C14 H10\n", "##MW=178\n", "##$NIST SOURCE=MSDC\n", "##XUNITS=M/Z\n", "##YUNITS=RELATIVE INTENSITY\n", "##XFACTOR=1\n", "##YFACTOR=1\n", "##FIRSTX=27\n", "##LASTX=181\n", "##FIRSTY=20\n", "##MAXX=181\n", "##MINX=27\n", "##MAXY=9999\n", "##MINY=10\n", "##NPOINTS=62\n", "##PEAK TABLE=(XY..XY)\n", "27,20 28,10 38,30 39,109\n", "50,129 51,129 52,30 61,40\n", "62,129 63,289 64,20 65,20\n", "69,20 73,10 74,219 75,299\n", "76,619 77,80 78,10 83,50\n", "85,30 86,99 87,169 88,439\n", "89,759 90,10 98,119 99,90\n", "100,50 101,50 102,60 110,40\n", "111,50 113,60 114,20 115,50\n", "122,40 123,20 124,20 125,50\n", "126,149 127,60 128,80 137,30\n", "138,30 139,209 140,80 149,70\n", "150,419 151,629 152,689 153,80\n", "163,50 164,20 174,129 175,199\n", "176,1409 177,799 178,9999 179,1569\n", "180,149 181,30\n", "##END=\n", "\n" ] } ], "source": [ "ms = X.ms_specs[0]\n", "print(ms.jdx_text)" ] }, { "cell_type": "markdown", "id": "f294f4e8-ada0-4a96-a3eb-16f7bbc7cfc5", "metadata": {}, "source": [ "Spectra of the given compound can be saved to the given directory via `save_ir_spectra`, `save_ir_spectra`, `save_ir_spectra`, `save_ir_spectra`, `save_all_spectra` methods of the `nist.compound.NistCompound` object. To save the specific spectrum, one can use the `save` method of the `nist.compound.Spectrum` object." ] }, { "cell_type": "markdown", "id": "bf5a1c68-ff84-4ad3-8d2d-625442969666", "metadata": {}, "source": [ "## Gas Chromatography\n", "\n", "To load gas chromatography data, use the `get_gas_chromatography` method:" ] }, { "cell_type": "code", "execution_count": 10, "id": "0459fd5a-6193-4467-9bdf-3f90c50c64d4", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X.gas_chromat" ] }, { "cell_type": "code", "execution_count": 11, "id": "6347a920-b320-41db-b56d-76166fdd1950", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[Chromatogram(C120127, Kovats' RI, non-polar column, isothermal: 5 data points),\n", " Chromatogram(C120127, Kovats' RI, polar column, isothermal: 1 data points),\n", " Chromatogram(C120127, Van Den Dool and Kratz RI, non-polar column, temperature ramp: 41 data points),\n", " Chromatogram(C120127, Van Den Dool and Kratz RI, non-polar column, custom temperature program: 7 data points),\n", " Chromatogram(C120127, Normal alkane RI, non-polar column, isothermal: 2 data points),\n", " Chromatogram(C120127, Normal alkane RI, non-polar column, temperature ramp: 20 data points),\n", " Chromatogram(C120127, Normal alkane RI, non-polar column, custom temperature program: 12 data points),\n", " Chromatogram(C120127, Normal alkane RI, polar column, temperature ramp: 1 data points),\n", " Chromatogram(C120127, Normal alkane RI, polar column, custom temperature program: 3 data points),\n", " Chromatogram(C120127, Lee's RI, non-polar column, isothermal: 5 data points),\n", " Chromatogram(C120127, Lee's RI, non-polar column, temperature ramp: 41 data points),\n", " Chromatogram(C120127, Lee's RI, non-polar column, custom temperature program: 35 data points),\n", " Chromatogram(C120127, Lee's RI, polar column, temperature ramp: 1 data points)]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X.get_gas_chromatography()\n", "X.gas_chromat" ] }, { "cell_type": "markdown", "id": "fdafccdd-0a6b-4538-9ece-e01383982399", "metadata": {}, "source": [ "`Chromatogram` object contains `pandas` DataFrames with raw GC data:" ] }, { "cell_type": "code", "execution_count": 12, "id": "59abf017-1d10-43ab-a9fc-0b07f65282ec", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(NistCompound(ID=C120127), \"Kovats' RI\", 'non-polar column', 'isothermal')" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gc = X.gas_chromat[0]\n", "gc.compound, gc.ri_type, gc.column_type, gc.temp_regime" ] }, { "cell_type": "code", "execution_count": 13, "id": "9a999133-b49f-4b64-879a-cebaabe4ba7f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Column typeActive phaseColumn length (m)Carrier gasSubstrateColumn diameter (mm)Phase thickness (μm)Temperature (C)IReferenceComment
0PackedSE-301.Chromaton N-AW160.1804.Kurbatova, S.V.; Finkelstein, E.E.; Kolosova, ...MSDC-RI
1CapillaryOV-125.N20.20.33150.1739.Zhang, M.; Chen, B.; Shen, S.; Chen, S., Compo...MSDC-RI
2CapillaryOV-125.N20.20.33160.1752.Zhang, M.; Chen, B.; Shen, S.; Chen, S., Compo...MSDC-RI
3CapillarySE-30100.0.5175.1769.Bredael, P., Retention indices of hydrocarbons...MSDC-RI
4CapillaryOV-10150.N20.3140.1729.0Gerasimenko, V.A.; Kirilenko, A.V.; Nabivach, ...MSDC-RI
\n", "
" ], "text/plain": [ " Column type Active phase Column length (m) Carrier gas Substrate \\\n", "0 Packed SE-30 1. Chromaton N-AW \n", "1 Capillary OV-1 25. N2 \n", "2 Capillary OV-1 25. N2 \n", "3 Capillary SE-30 100. \n", "4 Capillary OV-101 50. N2 \n", "\n", " Column diameter (mm) Phase thickness (μm) Temperature (C) I \\\n", "0 160. 1804. \n", "1 0.2 0.33 150. 1739. \n", "2 0.2 0.33 160. 1752. \n", "3 0.5 175. 1769. \n", "4 0.3 140. 1729.0 \n", "\n", " Reference Comment \n", "0 Kurbatova, S.V.; Finkelstein, E.E.; Kolosova, ... MSDC-RI \n", "1 Zhang, M.; Chen, B.; Shen, S.; Chen, S., Compo... MSDC-RI \n", "2 Zhang, M.; Chen, B.; Shen, S.; Chen, S., Compo... MSDC-RI \n", "3 Bredael, P., Retention indices of hydrocarbons... MSDC-RI \n", "4 Gerasimenko, V.A.; Kirilenko, A.V.; Nabivach, ... MSDC-RI " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gc.data" ] }, { "cell_type": "markdown", "id": "21dc7ef8-55e9-48e7-bb6b-f9b4cfabe688", "metadata": {}, "source": [ "Gas chromatography data of the given compound can be saved to the given directory via the `save_gas_chromatography` method of the `nist.compound.NistCompound` object. To save the specific table, one can use the `save` method of the `nist.compound.Chromatogram` object." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.14" } }, "nbformat": 4, "nbformat_minor": 5 }