NistChemPy

Basic Search

Basic search

There are five available search types:

  • by name (search_type = 'name');

  • by InChI (search_type = 'inchi');

  • by CAS RN (search_type = 'cas');

  • by chemical formula (search_type = 'formula');

  • and by NIST Compound ID (search_type = 'id'):

[1]:
import nistchempy as nist

s = nist.run_search(identifier = '1,2,3*-butane', search_type = 'name')
s
[1]:
NistSearch(success=True, num_compounds=10, lost=False)

The list of found compound IDs is stored in the compound_ids attribute. The compounds themselves are retrieved with the load_found_compounds method and are then available in the compounds attribute:

[2]:
s.compound_ids
[2]:
['C1871585',
 'C18338404',
 'C298180',
 'C1529686',
 'C632053',
 'C13138517',
 'C62521691',
 'C76397234',
 'C101257798',
 'C1464535']
[3]:
s.load_found_compounds()
s.compounds
[3]:
[NistCompound(ID=C1871585),
 NistCompound(ID=C18338404),
 NistCompound(ID=C298180),
 NistCompound(ID=C1529686),
 NistCompound(ID=C632053),
 NistCompound(ID=C13138517),
 NistCompound(ID=C62521691),
 NistCompound(ID=C76397234),
 NistCompound(ID=C101257798),
 NistCompound(ID=C1464535)]
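
Once the compounds are loaded, it can be convenient to look them up by their NIST ID. The following is a minimal sketch, assuming each loaded compound exposes the ID attribute that its printed representation suggests. The helper name index_by_id is hypothetical and not part of NistChemPy, and stand-in objects are used so the example runs without network access.

```python
from types import SimpleNamespace


def index_by_id(compounds):
    """Map each compound's NIST ID to the compound object.

    Assumes every item has an ``ID`` attribute, as the repr
    ``NistCompound(ID=...)`` suggests; this helper is illustrative only.
    """
    return {compound.ID: compound for compound in compounds}


# Stand-in objects so the sketch runs offline; with a real search,
# pass ``s.compounds`` after calling ``s.load_found_compounds()``.
demo_compounds = [SimpleNamespace(ID='C1871585'), SimpleNamespace(ID='C298180')]
by_id = index_by_id(demo_compounds)
print(sorted(by_id))
```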

Search Parameters

In addition to the main identifier, you can narrow the search using several parameters, which can be listed using the print_search_parameters function:

[4]:
nist.print_search_parameters()
use_SI         :   Units for thermodynamic data, "SI" if True and "calories" if False
match_isotopes :   Exactly match the specified isotopes (formula search only)
allow_other    :   Allow elements not specified in formula (formula search only)
allow_extra    :   Allow more atoms of elements in formula than specified (formula search only)
no_ion         :   Exclude ions from the search (formula search only)
cTG            :   Gas phase thermochemistry data
cTC            :   Condensed phase thermochemistry data
cTP            :   Phase change data
cTR            :   Reaction thermochemistry data
cIE            :   Gas phase ion energetics data
cIC            :   Ion clustering data
cIR            :   IR Spectrum
cTZ            :   THz IR spectrum
cMS            :   Mass spectrum (electron ionization)
cUV            :   UV/Visible spectrum
cGC            :   Gas Chromatography
cES            :   Vibrational and/or electronic energy levels
cDI            :   Constants of diatomic molecules
cSO            :   Henry's Law data

These options can be passed as keyword arguments to the nist.run_search function or defined in a nist.NistSearchParameters object:

[5]:
# query
identifier = 'C4H?Cl2'
search_type = 'formula'

# direct search (entries with IR spectra)
s1 = nist.run_search(identifier, search_type, cIR = True)

# search with NistSearchParameters
params = nist.NistSearchParameters(cIR = True)
s2 = nist.run_search(identifier, search_type, params)

# compare searches
print(sorted(s1.compound_ids))
print(sorted(s2.compound_ids))
['C110565', 'C110576', 'C1190223', 'C4028562', 'C4279225', 'C541333', 'C594376', 'C616217', 'C7581977', 'C760236', 'C764410', 'C821103', 'C926578']
['C110565', 'C110576', 'C1190223', 'C4028562', 'C4279225', 'C541333', 'C594376', 'C616217', 'C7581977', 'C760236', 'C764410', 'C821103', 'C926578']
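
When several data-type flags are needed, it may help to assemble them once and catch typos early. The sketch below builds a keyword dictionary from flag names copied from the print_search_parameters output above. The helper data_flags is hypothetical, and passing its result to nist.run_search with ** assumes the function accepts these flags as keyword arguments, as the cIR = True example does.

```python
# Flag names copied from the print_search_parameters() output above.
DATA_FLAGS = {
    'cTG', 'cTC', 'cTP', 'cTR', 'cIE', 'cIC', 'cIR',
    'cTZ', 'cMS', 'cUV', 'cGC', 'cES', 'cDI', 'cSO',
}


def data_flags(*names):
    """Return {flag: True} for each requested flag, rejecting unknown names."""
    unknown = sorted(set(names) - DATA_FLAGS)
    if unknown:
        raise ValueError(f'unknown data flags: {unknown}')
    return {name: True for name in names}


flags = data_flags('cIR', 'cMS')
print(flags)
# Intended use (requires network access):
#   s = nist.run_search('C4H?Cl2', 'formula', **flags)
```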

Limit of Found Compounds

The NIST Chemistry WebBook limits search results to 400 compounds. To find out whether your search was truncated, check the lost property:

[6]:
params = nist.NistSearchParameters(no_ion = True, cMS = True)
s = nist.run_search('C6H?O?', 'formula', params)
s
[6]:
NistSearch(success=True, num_compounds=400, lost=True)
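
In a script, this check can be wrapped in a small helper so that truncated results are not silently used. This is a sketch only: describe_search is a hypothetical name, and it assumes the success and lost values shown in the printed representation are readable as attributes alongside compound_ids. A stand-in object replaces the real search so the example runs offline.

```python
from types import SimpleNamespace


def describe_search(search):
    """Summarize a search result, flagging truncated result lists."""
    if not search.success:
        return 'search failed'
    count = len(search.compound_ids)
    if search.lost:
        return f'{count} compounds found, results truncated'
    return f'{count} compounds found'


# Stand-in for a NistSearch object that hit the 400-compound limit.
demo = SimpleNamespace(success=True, lost=True, compound_ids=['C0'] * 400)
print(describe_search(demo))
```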

To work around this limit when searching for a large number of compounds, split the chemical formula query into narrower sub-queries:

[7]:
sub_searches = []
for i in range(1, 7):
    s = nist.run_search(f'C6H?O{i}', 'formula', params)
    sub_searches.append( (len(s.compound_ids), s.lost) )
sub_searches
[7]:
[(170, False), (178, False), (80, False), (42, False), (7, False), (24, False)]
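
The sub-search results can then be merged into a single list of unique IDs, while recording any sub-query that is still truncated. The following is a minimal sketch under stated assumptions: collect_subsets is a hypothetical helper, and the search function is passed in as an argument so that a stub can stand in for the WebBook and the example runs offline.

```python
from types import SimpleNamespace


def collect_subsets(search_fn, formulas):
    """Run one search per formula and merge the compound IDs.

    ``search_fn`` is any callable that takes a formula string and returns
    an object with ``compound_ids`` and ``lost``.
    Returns (sorted unique IDs, formulas whose results were still truncated).
    """
    found = set()
    truncated = []
    for formula in formulas:
        result = search_fn(formula)
        found.update(result.compound_ids)
        if result.lost:
            truncated.append(formula)
    return sorted(found), truncated


# Stub standing in for the WebBook; the IDs are made up for illustration.
def fake_search(formula):
    data = {'C6H?O1': ['C1', 'C2'], 'C6H?O2': ['C2', 'C3']}
    return SimpleNamespace(compound_ids=data.get(formula, []), lost=False)


ids, truncated = collect_subsets(fake_search, [f'C6H?O{i}' for i in range(1, 3)])
print(ids, truncated)
# With NistChemPy (requires network access), assuming ``params`` from above:
#   ids, truncated = collect_subsets(
#       lambda f: nist.run_search(f, 'formula', params),
#       [f'C6H?O{i}' for i in range(1, 7)])
```

Any formula reported in the second return value would need to be split further before its results can be considered complete.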

A better way to overcome this problem is to use the pre-prepared compound list. For more details, see the Structure Search page of the Cookbook.


© Copyright 2023, Ivan Yu. Chernyshov.
