NistChemPy API

nistchempy

This package is a Python interface to the NIST Chemistry WebBook database. It provides additional data for efficient compound search and automates retrieval of the stored physico-chemical data.

nistchempy.get_all_data() DataFrame

Returns pandas dataframe containing info on all NIST Chem WebBook compounds

Returns:

dataframe containing pre-extracted compound info

Return type:

_pd.core.frame.DataFrame

nistchempy.get_compound(ID: str, **kwargs) NistCompound | None

Loads the main info on the given NIST compound

Parameters:
  • ID (str) – NIST compound ID, CAS RN or InChI

  • kwargs – requests.get kwargs parameters

Returns:

NistCompound object, or None if several compounds correspond to the given ID

Return type:

_tp.Optional[NistCompound]
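
A minimal usage sketch, assuming the package is installed and the WebBook is reachable. Only get_compound and the NistCompound attributes name, formula, and mol_weight come from this reference; the helper names describe_compound and show, and the example ID, are illustrative.

```python
from typing import Any, Optional


def describe_compound(compound: Optional[Any]) -> str:
    """Format a one-line summary of any object exposing the NistCompound
    attributes name, formula, and mol_weight."""
    if compound is None:
        # get_compound returns None when the ID matches several compounds
        return "no unique compound found"
    return f"{compound.name} ({compound.formula}), M = {compound.mol_weight}"


def show(ID: str) -> None:
    """Hypothetical helper: fetch a compound from the WebBook and print it."""
    import nistchempy as nist  # lazy import; the call below needs network access

    print(describe_compound(nist.get_compound(ID)))


# Example call (not executed here); "C71432" is assumed to be the NIST ID of benzene:
# show("C71432")
```

Checking for None before touching any attribute keeps the ambiguous-ID case from surfacing later as an AttributeError.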

nistchempy.run_search(identifier: str, search_type: str, search_parameters: NistSearchParameters | None = None, use_SI: bool = True, match_isotopes: bool = False, allow_other: bool = False, allow_extra: bool = False, no_ion: bool = False, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False, **kwargs) NistSearch

Searches compounds in NIST Chemistry WebBook

Parameters:
  • identifier (str) – NIST compound ID / formula / name / inchi / CAS RN

  • search_type (str) – identifier type; available options are ‘formula’, ‘name’, ‘inchi’, ‘cas’, and ‘id’

  • search_parameters (_tp.Optional[NistSearchParameters]) – search parameters; if provided, the following search parameter arguments are ignored

  • use_SI (bool) – if True, returns results in SI units; otherwise calories are used

  • match_isotopes (bool) – if True, exactly matches the specified isotopes (formula search only)

  • allow_other (bool) – if True, allows elements not specified in formula (formula search only)

  • allow_extra (bool) – if True, allows more atoms of elements in formula than specified (formula search only)

  • no_ion (bool) – if True, excludes ions from the search (formula search only)

  • cTG (bool) – if True, returns entries containing gas-phase thermodynamic data

  • cTC (bool) – if True, returns entries containing condensed-phase thermodynamic data

  • cTP (bool) – if True, returns entries containing phase-change thermodynamic data

  • cTR (bool) – if True, returns entries containing reaction thermodynamic data

  • cIE (bool) – if True, returns entries containing ion energetics thermodynamic data

  • cIC (bool) – if True, returns entries containing ion cluster thermodynamic data

  • cIR (bool) – if True, returns entries containing IR data

  • cTZ (bool) – if True, returns entries containing THz IR data

  • cMS (bool) – if True, returns entries containing MS data

  • cUV (bool) – if True, returns entries containing UV/Vis data

  • cGC (bool) – if True, returns entries containing gas chromatography data

  • cES (bool) – if True, returns entries containing vibrational and electronic energy levels

  • cDI (bool) – if True, returns entries containing constants of diatomic molecules

  • cSO (bool) – if True, returns entries containing info on Henry’s law

  • kwargs – requests.get parameters

Returns:

search object containing info on found compounds

Return type:

NistSearch
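
The data-availability flags can be collected programmatically before calling run_search. The sketch below is one possible way to do that, not part of the library: DATA_FLAGS is copied from the signature above, while data_flag_kwargs and search_formula are hypothetical helper names.

```python
from typing import Dict

# Data-availability flags listed in the run_search signature
DATA_FLAGS = (
    "cTG", "cTC", "cTP", "cTR", "cIE", "cIC", "cIR",
    "cTZ", "cMS", "cUV", "cGC", "cES", "cDI", "cSO",
)


def data_flag_kwargs(*flags: str) -> Dict[str, bool]:
    """Turn flag names into keyword arguments for run_search,
    rejecting names that are not in the documented signature."""
    unknown = sorted(set(flags) - set(DATA_FLAGS))
    if unknown:
        raise ValueError(f"unknown data flags: {unknown}")
    return {flag: True for flag in flags}


def search_formula(formula: str, *flags: str):
    """Hypothetical helper: formula search limited to entries with the given data."""
    import nistchempy as nist  # lazy import; the search needs network access

    return nist.run_search(formula, "formula", **data_flag_kwargs(*flags))


# Example call (not executed here): search_formula("C6H6", "cIR", "cMS")
```

Validating the names up front turns a mistyped flag into an immediate ValueError instead of an unexpected keyword forwarded through **kwargs.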

class nistchempy.NistSearchParameters(use_SI: bool = True, match_isotopes: bool = False, allow_other: bool = False, allow_extra: bool = False, no_ion: bool = False, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False)

Bases: object

GET parameters for compound search of NIST Chemistry WebBook

use_SI

if True, returns results in SI units; otherwise calories are used

Type:

bool

match_isotopes

if True, exactly matches the specified isotopes (formula search only)

Type:

bool

allow_other

if True, allows elements not specified in formula (formula search only)

Type:

bool

allow_extra

if True, allows more atoms of elements in formula than specified (formula search only)

Type:

bool

no_ion

if True, excludes ions from the search (formula search only)

Type:

bool

cTG

if True, returns entries containing gas-phase thermodynamic data

Type:

bool

cTC

if True, returns entries containing condensed-phase thermodynamic data

Type:

bool

cTP

if True, returns entries containing phase-change thermodynamic data

Type:

bool

cTR

if True, returns entries containing reaction thermodynamic data

Type:

bool

cIE

if True, returns entries containing ion energetics thermodynamic data

Type:

bool

cIC

if True, returns entries containing ion cluster thermodynamic data

Type:

bool

cIR

if True, returns entries containing IR data

Type:

bool

cTZ

if True, returns entries containing THz IR data

Type:

bool

cMS

if True, returns entries containing MS data

Type:

bool

cUV

if True, returns entries containing UV/Vis data

Type:

bool

cGC

if True, returns entries containing gas chromatography data

Type:

bool

cES

if True, returns entries containing vibrational and electronic energy levels

Type:

bool

cDI

if True, returns entries containing constants of diatomic molecules

Type:

bool

cSO

if True, returns entries containing info on Henry’s law

Type:

bool

use_SI: bool = True
match_isotopes: bool = False
allow_other: bool = False
allow_extra: bool = False
no_ion: bool = False
cTG: bool = False
cTC: bool = False
cTP: bool = False
cTR: bool = False
cIE: bool = False
cIC: bool = False
cIR: bool = False
cTZ: bool = False
cMS: bool = False
cUV: bool = False
cGC: bool = False
cES: bool = False
cDI: bool = False
cSO: bool = False
get_request_parameters() dict

Returns dictionary containing GET parameters

Returns:

dictionary of GET parameters relevant to the search

Return type:

dict
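
run_search documents that a search_parameters object, when provided, takes precedence over the individual flag arguments. The sketch below imitates that rule with a local stand-in so it runs without the library; Params and resolve are hypothetical names, and the code is not the package's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Params:
    """Hypothetical stand-in with three of the NistSearchParameters fields."""
    use_SI: bool = True
    no_ion: bool = False
    cIR: bool = False


def resolve(search_parameters: Optional[Params] = None, **flags: bool) -> Params:
    """Mimic the documented rule: an explicit parameters object wins,
    and individual flags are used only when no object is given."""
    if search_parameters is not None:
        return search_parameters
    return Params(**flags)


print(resolve(cIR=True))                       # flags are used
print(resolve(Params(no_ion=True), cIR=True))  # cIR=True is ignored
```

With the real class, the same idea lets one NistSearchParameters instance be built once and passed as search_parameters to several run_search calls.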

nistchempy.get_search_parameters() Dict[str, str]

Returns search parameters and the corresponding keys

Returns:

{short_key => search_parameter}

Return type:

_tp.Dict[str, str]

nistchempy.print_search_parameters() None

Prints available search parameters

nistchempy.requests

Request wrappers for NIST Chemistry WebBook APIs

nistchempy.requests.BASE_URL

base URL of the NIST Chemistry WebBook database

Type:

str

nistchempy.requests.SEARCH_URL

relative URL for the search API

Type:

str

nistchempy.requests.fix_html(html: str) str

Fixes detected typos in html code of NIST Chem WebBook web pages

Parameters:

html (str) – text of html-file

Returns:

fixed html-file

Return type:

str

class nistchempy.requests.NistResponse(response: Response)

Bases: object

Describes response to the GET request to the NIST Chemistry WebBook

response

request’s response

Type:

_requests.models.Response

ok

True if request’s status code is less than 400

Type:

bool

content_type

content type of the response

Type:

_tp.Optional[str]

text

text of the response

Type:

_tp.Optional[str]

soup

BeautifulSoup object of the html response

Type:

_tp.Optional[_bs4.BeautifulSoup]

response: Response
ok: bool
content_type: str | None
text: str | None
soup: BeautifulSoup | None = None
nistchempy.requests.make_nist_request(url: str, params: dict = {}, **kwargs) NistResponse

Dummy request to the NIST Chemistry WebBook

Parameters:
  • url (str) – URL of the NIST webpage

  • params (dict) – GET request parameters

  • kwargs – requests.get kwargs parameters

Returns:

wrapper for the request’s response

Return type:

NistResponse
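
A short sketch of how the returned wrapper might be consumed, assuming only the NistResponse attributes ok and text listed above. The helper names response_text and fetch_page are hypothetical.

```python
from typing import Any, Dict, Optional


def response_text(nr: Any) -> Optional[str]:
    """Return the body of a NistResponse-like object, or None when the
    request failed (ok is False)."""
    if not nr.ok:
        return None
    return nr.text


def fetch_page(url: str, params: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Hypothetical helper: request a WebBook page through the library wrapper."""
    import nistchempy.requests as ncpr  # lazy import; the call needs network access

    # Pass a fresh dict each time instead of relying on the shared {} default
    return response_text(ncpr.make_nist_request(url, params=dict(params or {})))
```

Because the signature uses a mutable default (params: dict = {}), passing an explicit dictionary on every call is the safer habit.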

nistchempy.compound

The module contains compound-related functionality

nistchempy.compound.SPEC_TYPES

dictionary mapping spectrum-type abbreviations used on compound pages (keys) to those used in URLs for downloading JDX files (values)

Type:

dict

nistchempy.compound.urlparse(url, scheme='', allow_fragments=True)

Parse a URL into 6 components: <scheme>://<netloc>/<path>;<params>?<query>#<fragment>

The result is a named 6-tuple with fields corresponding to the above. It is either a ParseResult or ParseResultBytes object, depending on the type of the url parameter.

The username, password, hostname, and port sub-components of netloc can also be accessed as attributes of the returned object.

The scheme argument provides the default value of the scheme component when no scheme is found in url.

If allow_fragments is False, no attempt is made to separate the fragment component from the previous component, which can be either path or query.

Note that % escapes are not expanded.

nistchempy.compound.parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&')

Parse a query given as a string argument.

Arguments:
  • qs – percent-encoded query string to be parsed

  • keep_blank_values – flag indicating whether blank values in percent-encoded queries should be treated as blank strings. A true value indicates that blanks should be retained as blank strings. The default false value indicates that blank values are to be ignored and treated as if they were not included.

  • strict_parsing – flag indicating what to do with parsing errors. If false (the default), errors are silently ignored. If true, errors raise a ValueError exception.

  • encoding and errors – specify how to decode percent-encoded sequences into Unicode characters, as accepted by the bytes.decode() method.

  • max_num_fields – int. If set, then throws a ValueError if there are more than n fields read by parse_qsl().

  • separator – str. The symbol to use for separating the query arguments. Defaults to &.

Returns a dictionary.
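
urlparse and parse_qs are the standard-library functions from urllib.parse, re-exported here through the module's imports. The example below shows what they return for a WebBook-style URL; the URL and its query keys are illustrative assumptions.

```python
from urllib.parse import parse_qs, urlparse

# Illustrative WebBook-style URL; the query keys are assumptions for the example
url = "https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Units=SI&Mask=80#IR-Spec"

parts = urlparse(url)
query = parse_qs(parts.query)

print(parts.netloc)    # webbook.nist.gov
print(parts.path)      # /cgi/cbook.cgi
print(parts.fragment)  # IR-Spec
print(query)           # {'ID': ['C71432'], 'Units': ['SI'], 'Mask': ['80']}
```

Note that parse_qs maps every key to a list of values, so a single value is read as query["ID"][0].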

class nistchempy.compound.Spectrum(compound: NistCompound, spec_type: str, spec_idx: str, jdx_text: str)

Bases: object

Wrapper for IR, THz, MS, and UV-Vis spectra extracted from NIST Chemistry WebBook

compound

parent NistCompound object

Type:

NistCompound

spec_type

IR / TZ (THz IR) / MS / UV (UV-Vis)

Type:

str

spec_idx

index of the spectrum

Type:

str

jdx_text

text block of the corresponding JDX-file

Type:

str

compound: NistCompound
spec_type: str
spec_idx: str
jdx_text: str
save(name=None, path_dir=None)

Saves spectrum in JDX format

class nistchempy.compound.NistCompound(ID: str | None, name: str | None, synonyms: List[str], formula: str | None, mol_weight: float | None, inchi: str | None, inchi_key: str | None, cas_rn: str | None, mol_refs: Dict[str, str], data_refs: Dict[str, str], nist_public_refs: Dict[str, str], nist_subscription_refs: Dict[str, str], nist_response: NistResponse)

Bases: object

Stores info on NIST Chemistry WebBook compound

ID

NIST compound ID

Type:

_tp.Optional[str]

name

chemical name

Type:

_tp.Optional[str]

synonyms

synonyms of the chemical name

Type:

_tp.List[str]

formula

chemical formula

Type:

_tp.Optional[str]

mol_weight

molecular weight, g/mol

Type:

_tp.Optional[float]

inchi

InChI string

Type:

_tp.Optional[str]

inchi_key

InChI key string

Type:

_tp.Optional[str]

cas_rn

CAS registry number

Type:

_tp.Optional[str]

mol_refs

references to 2D and 3D MOL-files

Type:

_tp.Dict[str, str]

data_refs

references to the webpages containing physico-chemical data for the given compound

Type:

_tp.Dict[str, str]

nist_public_refs

references to webpages of other public NIST databases containing data for the given compound

Type:

_tp.Dict[str, str]

nist_subscription_refs

references to webpages of subscription NIST databases containing data for the given compound

Type:

_tp.Dict[str, str]

nist_response

response to the GET request

Type:

NistResponse

mol2D

text block of a MOL-file containing 2D atomic coordinates

Type:

_tp.Optional[str]

mol3D

text block of a MOL-file containing 3D atomic coordinates

Type:

_tp.Optional[str]

ir_specs

list of IR Spectrum objects

Type:

_tp.List[Spectrum]

thz_specs

list of THz Spectrum objects

Type:

_tp.List[Spectrum]

ms_specs

list of MS Spectrum objects

Type:

_tp.List[Spectrum]

uv_specs

list of UV-Vis Spectrum objects

Type:

_tp.List[Spectrum]

ID: str | None
name: str | None
synonyms: List[str]
formula: str | None
mol_weight: float | None
inchi: str | None
inchi_key: str | None
cas_rn: str | None
mol_refs: Dict[str, str]
data_refs: Dict[str, str]
nist_public_refs: Dict[str, str]
nist_subscription_refs: Dict[str, str]
nist_response: NistResponse
mol2D: str | None
mol3D: str | None
ir_specs: List[Spectrum]
thz_specs: List[Spectrum]
ms_specs: List[Spectrum]
uv_specs: List[Spectrum]
get_molfile(dim: int, **kwargs) None

Loads text block of 2D / 3D molfile

Parameters:
  • dim (int) – dimensionality of molfile (2D / 3D)

  • kwargs – requests.get kwargs parameters

get_mol2D(**kwargs) None

Loads text block of 2D molfile

Parameters:

kwargs – requests.get kwargs parameters

get_mol3D(**kwargs) None

Loads text block of 3D molfile

Parameters:

kwargs – requests.get kwargs parameters

get_molfiles(**kwargs) None

Loads text block of all available molfiles

Parameters:

kwargs – requests.get kwargs parameters
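
The molfile loaders return None and store their results in the mol2D and mol3D attributes. A minimal sketch of writing those text blocks to disk follows; write_molfiles, download_molfiles, and the file-naming scheme are assumptions made for this example, not library features.

```python
from pathlib import Path
from typing import Any, List


def write_molfiles(compound: Any, out_dir: str) -> List[Path]:
    """Write the mol2D / mol3D text blocks to <ID>_2d.mol / <ID>_3d.mol.
    The file naming is an assumption made for this example."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    written = []
    for suffix, text in (("2d", compound.mol2D), ("3d", compound.mol3D)):
        if text is None:  # both attributes are Optional in this reference
            continue
        path = folder / f"{compound.ID}_{suffix}.mol"
        path.write_text(text)
        written.append(path)
    return written


def download_molfiles(ID: str, out_dir: str) -> List[Path]:
    """Hypothetical helper: fetch a compound, load its molfiles, save them."""
    import nistchempy as nist  # lazy import; needs network access

    compound = nist.get_compound(ID)
    if compound is None:
        return []
    compound.get_molfiles()  # fills mol2D / mol3D where available
    return write_molfiles(compound, out_dir)
```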

get_spectrum(spec_type: str, spec_idx: str) Spectrum

Loads spectrum of given type (IR / TZ / MS / UV) and index

Parameters:
  • spec_type (str) – spectrum type [ IR / TZ / MS / UV ]

  • spec_idx (str) – spectrum index

Returns:

wrapper for the text block of JDX-formatted spectrum

Return type:

Spectrum

get_spectra(spec_type: str) None

Loads all available spectra of given type (IR / TZ / MS / UV)

Parameters:

spec_type (str) – spectrum type [ IR / TZ / MS / UV ]

get_ir_spectra()

Loads all available IR spectra

get_thz_spectra()

Loads all available THz spectra

get_ms_spectra()

Loads all available MS spectra

get_uv_spectra()

Loads all available UV-Vis spectra

get_all_spectra()

Loads all available spectra

save_spectra(spec_type, path_dir='./') None

Saves all spectra of given type to the specified folder

Parameters:
  • spec_type (str) – spectrum type [ IR / TZ / MS / UV ]

  • path_dir (str) – directory to save spectra

save_ir_spectra(path_dir='./') None

Saves IR spectra to the specified folder

save_thz_spectra(path_dir='./') None

Saves THz spectra to the specified folder

save_ms_spectra(path_dir='./') None

Saves mass spectra to the specified folder

save_uv_spectra(path_dir='./') None

Saves all UV-Vis spectra to the specified folder

save_all_spectra(path_dir='./') None

Saves all available spectra to the specified folder
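
The loading and saving methods are typically used together: load the spectra first, then write them out. The sketch below assumes that order and uses only the methods and list attributes documented above; count_spectra and download_spectra are hypothetical helper names.

```python
from typing import Any, Dict


def count_spectra(compound: Any) -> Dict[str, int]:
    """Count loaded spectra per type using the NistCompound list attributes."""
    return {
        "IR": len(compound.ir_specs),
        "TZ": len(compound.thz_specs),
        "MS": len(compound.ms_specs),
        "UV": len(compound.uv_specs),
    }


def download_spectra(ID: str, path_dir: str = "./") -> Dict[str, int]:
    """Hypothetical helper: load every available spectrum and save it as JDX."""
    import nistchempy as nist  # lazy import; needs network access

    compound = nist.get_compound(ID)
    if compound is None:  # ambiguous ID, nothing to download
        return {}
    compound.get_all_spectra()            # fills ir_specs, thz_specs, ms_specs, uv_specs
    compound.save_all_spectra(path_dir)   # writes the loaded spectra to path_dir
    return count_spectra(compound)
```

Whether path_dir must already exist is not stated in this reference, so creating the folder beforehand is the cautious choice.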

nistchempy.compound.compound_from_response(nr: NistResponse) NistCompound | None

Initializes NistCompound object from the corresponding response

Parameters:

nr (_ncpr.NistResponse) – response to the GET request for a compound

Returns:

NistCompound object, or None if several compounds correspond to the given ID

Return type:

_tp.Optional[NistCompound]

nistchempy.compound.get_compound(ID: str, **kwargs) NistCompound | None

Loads the main info on the given NIST compound

Parameters:
  • ID (str) – NIST compound ID, CAS RN or InChI

  • kwargs – requests.get kwargs parameters

Returns:

NistCompound object, or None if several compounds correspond to the given ID

Return type:

_tp.Optional[NistCompound]

nistchempy.compound_list

Loads pre-prepared info on compound structures and data availability

nistchempy.compound_list.get_all_data() DataFrame

Returns pandas dataframe containing info on all NIST Chem WebBook compounds

Returns:

dataframe containing pre-extracted compound info

Return type:

_pd.core.frame.DataFrame