NistChemPy API
nistchempy
This package is a Python interface to the NIST Chemistry WebBook database. It provides additional data for efficient compound search and automatic retrieval of the stored physico-chemical data.
- nistchempy.get_all_data() DataFrame
 Returns pandas dataframe containing info on all NIST Chem WebBook compounds
- Returns:
 dataframe containing pre-extracted compound info
- Return type:
 _pd.core.frame.DataFrame
- nistchempy.get_compound(ID: str, request_config: RequestConfig | None = None) NistCompound | None
 Loads the main info on the given NIST compound
- Parameters:
 ID (str) – NIST compound ID, CAS RN or InChI
request_config (_tp.Optional[_ncpr.RequestConfig]) – additional requests.get parameters
- Returns:
 NistCompound object, or None if there are several compounds corresponding to the given ID
- Return type:
 _tp.Optional[NistCompound]
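Because get_compound returns None when the identifier matches several compounds, callers usually want to handle that case explicitly. The following is a minimal sketch of one way to do it. The helper name load_compound and its getter argument are assumptions made for illustration and are not part of the package.

```python
from typing import Any, Callable, Optional


def load_compound(identifier: str,
                  getter: Optional[Callable[[str], Any]] = None) -> Any:
    """Return the compound for `identifier`, or raise if the lookup is ambiguous.

    `getter` is a hypothetical hook that defaults to nistchempy.get_compound;
    it exists only so the helper can be exercised without network access.
    """
    if getter is None:
        import nistchempy  # imported lazily; the real call needs network access
        getter = nistchempy.get_compound
    compound = getter(identifier)
    if compound is None:
        # Documented behaviour: None means several compounds matched the ID.
        raise LookupError(f"{identifier!r} did not resolve to a single compound")
    return compound
```

With the package installed, load_compound("7732-18-5") would look up water by its CAS RN and either return a NistCompound or raise LookupError.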
- nistchempy.run_search(identifier: str, search_type: str, search_parameters: NistSearchParameters | None = None, request_config: RequestConfig | None = None, use_SI: bool = True, match_isotopes: bool = False, allow_other: bool = False, allow_extra: bool = False, no_ion: bool = False, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False) NistSearch
 Searches compounds in NIST Chemistry WebBook
- Parameters:
 identifier (str) – NIST compound ID / formula / name / inchi / CAS RN
search_type (str) – identifier type; available options are ‘formula’, ‘name’, ‘inchi’, ‘cas’ and ‘id’
search_parameters (_tp.Optional[NistSearchParameters]) – search parameters; if provided, the following search parameter arguments are ignored
request_config (_tp.Optional[_ncpr.RequestConfig]) – additional requests.get parameters
use_SI (bool) – if True, returns results in SI units; otherwise calories are used
match_isotopes (bool) – if True, exactly matches the specified isotopes (formula search only)
allow_other (bool) – if True, allows elements not specified in formula (formula search only)
allow_extra (bool) – if True, allows more atoms of elements in formula than specified (formula search only)
no_ion (bool) – if True, excludes ions from the search (formula search only)
cTG (bool) – if True, returns entries containing gas-phase thermodynamic data
cTC (bool) – if True, returns entries containing condensed-phase thermodynamic data
cTP (bool) – if True, returns entries containing phase-change thermodynamic data
cTR (bool) – if True, returns entries containing reaction thermodynamic data
cIE (bool) – if True, returns entries containing ion energetics thermodynamic data
cIC (bool) – if True, returns entries containing ion cluster thermodynamic data
cIR (bool) – if True, returns entries containing IR data
cTZ (bool) – if True, returns entries containing THz IR data
cMS (bool) – if True, returns entries containing MS data
cUV (bool) – if True, returns entries containing UV/Vis data
cGC (bool) – if True, returns entries containing gas chromatography data
cES (bool) – if True, returns entries containing vibrational and electronic energy levels
cDI (bool) – if True, returns entries containing constants of diatomic molecules
cSO (bool) – if True, returns entries containing info on Henry’s law
- Returns:
 search object containing info on found compounds
- Return type:
 NistSearch
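The data-availability flags (cTG through cSO) are plain boolean keyword arguments, so a search restricted to entries with particular data can be assembled from a list of flag names. The sketch below assumes only the run_search signature documented above. The names DATA_FLAGS, data_flag_kwargs and search_by_formula are illustrative and not part of the package.

```python
from typing import Any, Dict, Iterable

# Data-availability flags listed in the run_search signature above.
DATA_FLAGS = frozenset({
    "cTG", "cTC", "cTP", "cTR", "cIE", "cIC", "cIR",
    "cTZ", "cMS", "cUV", "cGC", "cES", "cDI", "cSO",
})


def data_flag_kwargs(flags: Iterable[str]) -> Dict[str, bool]:
    """Turn flag names such as 'cIR' into keyword arguments set to True."""
    flags = list(flags)
    unknown = sorted(set(flags) - DATA_FLAGS)
    if unknown:
        raise ValueError(f"unknown data flags: {unknown}")
    return {flag: True for flag in flags}


def search_by_formula(formula: str, flags: Iterable[str] = (), **options: Any):
    """Run a formula search limited to entries that have the requested data."""
    import nistchempy  # imported lazily; calling this needs network access
    return nistchempy.run_search(formula, "formula",
                                 **data_flag_kwargs(flags), **options)
```

For example, search_by_formula("C6H6", ["cIR", "cMS"], no_ion=True) would request benzene-formula entries that have both IR and MS data, excluding ions.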
- nistchempy.run_structural_search(molfile: str | None = None, molblock: str | None = None, search_type: str = 'sub', search_parameters: NistSearchParameters | None = None, request_config: RequestConfig | None = None, use_SI: bool = True, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False) NistSearch
 Runs (sub)structural search for compounds in NIST Chemistry WebBook
- Parameters:
 molfile (_tp.Optional[str]) – path to the MOL-file of the structure to search; if specified, molblock is ignored
molblock (_tp.Optional[str]) – text of the MOL-file of the structure to search
search_type (str) – type of structural search; available options are ‘struct’ (exact match) and ‘sub’ (substructure search, default)
search_parameters (_tp.Optional[NistSearchParameters]) – search parameters; if provided, the following search parameter arguments are ignored
request_config (_tp.Optional[_ncpr.RequestConfig]) – additional requests.get parameters
use_SI (bool) – if True, returns results in SI units; otherwise calories are used
cTG (bool) – if True, returns entries containing gas-phase thermodynamic data
cTC (bool) – if True, returns entries containing condensed-phase thermodynamic data
cTP (bool) – if True, returns entries containing phase-change thermodynamic data
cTR (bool) – if True, returns entries containing reaction thermodynamic data
cIE (bool) – if True, returns entries containing ion energetics thermodynamic data
cIC (bool) – if True, returns entries containing ion cluster thermodynamic data
cIR (bool) – if True, returns entries containing IR data
cTZ (bool) – if True, returns entries containing THz IR data
cMS (bool) – if True, returns entries containing MS data
cUV (bool) – if True, returns entries containing UV/Vis data
cGC (bool) – if True, returns entries containing gas chromatography data
cES (bool) – if True, returns entries containing vibrational and electronic energy levels
cDI (bool) – if True, returns entries containing constants of diatomic molecules
cSO (bool) – if True, returns entries containing info on Henry’s law
- Returns:
 search object containing info on found compounds
- Return type:
 NistSearch
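The structure can be given either as a path (molfile) or as text (molblock), and molfile takes precedence when both are set. A small sketch that makes this choice explicit before calling run_structural_search might look as follows. The helper name structural_search_kwargs is an assumption for illustration, and the file name in the usage note is a placeholder.

```python
from typing import Dict, Optional

# Documented search types: exact match and substructure search.
STRUCTURAL_SEARCH_TYPES = ("struct", "sub")


def structural_search_kwargs(molfile: Optional[str] = None,
                             molblock: Optional[str] = None,
                             search_type: str = "sub") -> Dict[str, str]:
    """Build keyword arguments, mirroring the documented molfile precedence."""
    if search_type not in STRUCTURAL_SEARCH_TYPES:
        raise ValueError(f"search_type must be one of {STRUCTURAL_SEARCH_TYPES}")
    if molfile is not None:
        # molfile wins: molblock would be ignored by the library anyway.
        return {"molfile": molfile, "search_type": search_type}
    if molblock is not None:
        return {"molblock": molblock, "search_type": search_type}
    raise ValueError("either molfile or molblock must be given")
```

With the package installed, nistchempy.run_structural_search(**structural_search_kwargs(molfile="query.mol")) would run a substructure search for the structure stored in the hypothetical file query.mol.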
- class nistchempy.NistSearchParameters(use_SI: bool = True, match_isotopes: bool = False, allow_other: bool = False, allow_extra: bool = False, no_ion: bool = False, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False)
 Bases: object
GET parameters for compound search of NIST Chemistry WebBook
- use_SI
 if True, returns results in SI units; otherwise calories are used
- Type:
 bool
- match_isotopes
 if True, exactly matches the specified isotopes (formula search only)
- Type:
 bool
- allow_other
 if True, allows elements not specified in formula (formula search only)
- Type:
 bool
- allow_extra
 if True, allows more atoms of elements in formula than specified (formula search only)
- Type:
 bool
- no_ion
 if True, excludes ions from the search (formula search only)
- Type:
 bool
- cTG
 if True, returns entries containing gas-phase thermodynamic data
- Type:
 bool
- cTC
 if True, returns entries containing condensed-phase thermodynamic data
- Type:
 bool
- cTP
 if True, returns entries containing phase-change thermodynamic data
- Type:
 bool
- cTR
 if True, returns entries containing reaction thermodynamic data
- Type:
 bool
- cIE
 if True, returns entries containing ion energetics thermodynamic data
- Type:
 bool
- cIC
 if True, returns entries containing ion cluster thermodynamic data
- Type:
 bool
- cIR
 if True, returns entries containing IR data
- Type:
 bool
- cTZ
 if True, returns entries containing THz IR data
- Type:
 bool
- cMS
 if True, returns entries containing MS data
- Type:
 bool
- cUV
 if True, returns entries containing UV/Vis data
- Type:
 bool
- cGC
 if True, returns entries containing gas chromatography data
- Type:
 bool
- cES
 if True, returns entries containing vibrational and electronic energy levels
- Type:
 bool
- cDI
 if True, returns entries containing constants of diatomic molecules
- Type:
 bool
- cSO
 if True, returns entries containing info on Henry’s law
- Type:
 bool
- use_SI: bool = True
 
- match_isotopes: bool = False
 
- allow_other: bool = False
 
- allow_extra: bool = False
 
- no_ion: bool = False
 
- cTG: bool = False
 
- cTC: bool = False
 
- cTP: bool = False
 
- cTR: bool = False
 
- cIE: bool = False
 
- cIC: bool = False
 
- cIR: bool = False
 
- cTZ: bool = False
 
- cMS: bool = False
 
- cUV: bool = False
 
- cGC: bool = False
 
- cES: bool = False
 
- cDI: bool = False
 
- cSO: bool = False
 
- get_request_parameters() dict
 Returns dictionary containing GET parameters
- Returns:
 dictionary of GET parameters relevant to the search
- Return type:
 dict
- nistchempy.get_search_parameters() Dict[str, str]
 Returns search parameters and the corresponding keys
- Returns:
 {short_key => search_parameter}
- Return type:
 _tp.Dict[str, str]
- nistchempy.print_search_parameters() None
 Prints available search parameters
- class nistchempy.RequestConfig(delay: float = 0.0, max_attempts: int | None = 1, kwargs: dict = <factory>)
 Bases: object
Contains parameters used by the make_nist_request function
- Attributes:
 delay (float): time delay in seconds after getting a response from NIST
max_attempts (_tp.Optional[int]): if > 1, enables reattempting to get a response in case of request errors or a non-OK response
kwargs (dict): kwargs for requests.get inside make_nist_request
- delay: float = 0.0
 
- max_attempts: int | None = 1
 
- kwargs: dict
 
- nistchempy.get_crawl_delay(useragent: str = '*', config: RequestConfig | None = None) float
 Returns NIST Chemistry WebBook’s crawl delay for the given user agent
- Parameters:
 useragent (str) – user agent
- Returns:
 crawl delay in seconds
- Return type:
 float
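The crawl delay can be fed into a RequestConfig so that later requests pause for the advertised interval. The sketch below assumes the RequestConfig fields documented above (delay, max_attempts, kwargs) and uses timeout, a standard requests.get keyword. The helper names and the default values of 3 attempts and 30 seconds are illustrative choices, not package defaults.

```python
from typing import Any, Dict


def polite_config_fields(crawl_delay: float, max_attempts: int = 3,
                         timeout: float = 30.0) -> Dict[str, Any]:
    """Build the three RequestConfig fields from a crawl delay in seconds."""
    if crawl_delay < 0:
        raise ValueError("crawl_delay must be non-negative")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    return {
        "delay": float(crawl_delay),
        "max_attempts": max_attempts,
        # Forwarded to requests.get by the library, per the kwargs description.
        "kwargs": {"timeout": timeout},
    }


def make_polite_config(**overrides: Any):
    """Query the crawl delay and wrap it in a RequestConfig (needs network)."""
    import nistchempy  # imported lazily so polite_config_fields stays offline
    fields = polite_config_fields(nistchempy.get_crawl_delay(), **overrides)
    return nistchempy.RequestConfig(**fields)
```

The resulting object can then be passed as request_config to get_compound, run_search or run_structural_search.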
nistchempy.compound
The module contains compound-related functionality
- nistchempy.compound.SPEC_TYPES
 dictionary containing abbreviations for spectra types used in compound page (keys) or urls for downloading JDX-files (values)
- Type:
 dict
- nistchempy.compound.urlparse(url, scheme='', allow_fragments=True)
 Parse a URL into 6 components: <scheme>://<netloc>/<path>;<params>?<query>#<fragment>
The result is a named 6-tuple with fields corresponding to the above. It is either a ParseResult or ParseResultBytes object, depending on the type of the url parameter.
The username, password, hostname, and port sub-components of netloc can also be accessed as attributes of the returned object.
The scheme argument provides the default value of the scheme component when no scheme is found in url.
If allow_fragments is False, no attempt is made to separate the fragment component from the previous component, which can be either path or query.
Note that % escapes are not expanded.
- nistchempy.compound.parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&')
 Parse a query given as a string argument.
- Parameters:
 qs – percent-encoded query string to be parsed
keep_blank_values – flag indicating whether blank values in percent-encoded queries should be treated as blank strings. A true value indicates that blanks should be retained as blank strings. The default false value indicates that blank values are to be ignored and treated as if they were not included.
strict_parsing – flag indicating what to do with parsing errors. If false (the default), errors are silently ignored. If true, errors raise a ValueError exception.
encoding, errors – specify how to decode percent-encoded sequences into Unicode characters, as accepted by the bytes.decode() method.
max_num_fields (int) – if set, a ValueError is raised if there are more than max_num_fields fields read by parse_qsl().
separator (str) – the symbol to use for separating the query arguments. Defaults to &.
- Returns:
 a dictionary
- class nistchempy.compound.Spectrum(compound: NistCompound, spec_type: str, spec_idx: str, jdx_text: str)
 Bases: object
Wrapper for IR, THz IR, MS, and UV-Vis spectra extracted from NIST Chemistry WebBook
- compound
 parent NistCompound object
- Type:
 NistCompound
- spec_type
 IR / TZ (THz IR) / MS / UV (UV-Vis)
- Type:
 str
- spec_idx
 index of the spectrum
- Type:
 str
- jdx_text
 text block of the corresponding JDX-file
- Type:
 str
- compound: NistCompound
 
- spec_type: str
 
- spec_idx: str
 
- jdx_text: str
 
- save(name: str = None, path_dir: str = None) None
 Saves spectrum in JDX format
- name
 custom filename (default name is formed from compound ID, spectrum type and index)
- Type:
 str
- path_dir
 directory where output file will be saved
- Type:
 str
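Since jdx_text holds the raw JCAMP-DX file, its header can be inspected without extra dependencies: labelled records in that format start with "##" and separate the label from the value with "=". The sketch below is a deliberately small reader for those header lines only, not a full JCAMP-DX parser. The function name jdx_header is an assumption, and the sample text in the usage note is made up for illustration.

```python
from typing import Dict


def jdx_header(jdx_text: str) -> Dict[str, str]:
    """Collect '##LABEL=value' records from a JCAMP-DX text block."""
    header: Dict[str, str] = {}
    for line in jdx_text.splitlines():
        # Skip data rows and anything that is not a labelled record.
        if not line.startswith("##") or "=" not in line:
            continue
        label, value = line[2:].split("=", 1)
        # Later records with the same label overwrite earlier ones.
        header[label.strip().upper()] = value.strip()
    return header
```

For a Spectrum object spec, jdx_header(spec.jdx_text).get("TITLE") would give the title record if the file has one. On a made-up input such as "##TITLE=Water\n##XUNITS=1/CM\n400.0 0.1\n##END=" the function returns the TITLE, XUNITS and END records and ignores the data row.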
- class nistchempy.compound.Chromatogram(compound: NistCompound, ri_type: str, column_type: str, temp_regime: str, data: DataFrame)
 Bases: object
Wrapper for chromatography data extracted from NIST Chemistry WebBook
- compound
 parent NistCompound object
- Type:
 NistCompound
- ri_type
 type of retention index: Kovats, van den Dool & Kratz, etc.
- Type:
 str
- column_type
 polar / non-polar
- Type:
 str
- temp_regime
 temperature regime: isothermal / ramp / custom
- Type:
 str
- data
 experimental data
- Type:
 _pd.core.frame.DataFrame
- compound: NistCompound
 
- ri_type: str
 
- column_type: str
 
- temp_regime: str
 
- data: DataFrame
 
- save(name: str = None, path_dir: str = None, **kwargs) None
 Saves chromatograms in CSV format
- name
 custom filename (default name is formed from compound ID, spectrum type and index)
- Type:
 str
- path_dir
 directory where output file will be saved
- Type:
 str
- kwargs
 parameters for pandas DataFrame to_csv method
- class nistchempy.compound.NistCompound(_request_config: RequestConfig, _nist_response: NistResponse, ID: str | None, name: str | None, synonyms: List[str], formula: str | None, mol_weight: float | None, inchi: str | None, inchi_key: str | None, cas_rn: str | None, mol_refs: Dict[str, str], data_refs: Dict[str, str], nist_public_refs: Dict[str, str], nist_subscription_refs: Dict[str, str])
 Bases: object
Stores info on NIST Chemistry WebBook compound
- _request_config
 additional requests.get parameters
- Type:
 _ncpr.RequestConfig
- _nist_response
 response to the GET request
- Type:
 _ncpr.NistResponse
- ID
 NIST compound ID
- Type:
 _tp.Optional[str]
- name
 chemical name
- Type:
 _tp.Optional[str]
- synonyms
 synonyms of the chemical name
- Type:
 _tp.List[str]
- formula
 chemical formula
- Type:
 _tp.Optional[str]
- mol_weight
 molecular weight, g/mol
- Type:
 _tp.Optional[float]
- inchi
 InChI string
- Type:
 _tp.Optional[str]
- inchi_key
 InChI key string
- Type:
 _tp.Optional[str]
- cas_rn
 CAS registry number
- Type:
 _tp.Optional[str]
- mol_refs
 references to 2D and 3D MOL-files
- Type:
 _tp.Dict[str, str]
- data_refs
 references to the webpages containing physico-chemical data for the given compound
- Type:
 _tp.Dict[str, str]
- nist_public_refs
 references to webpages of other public NIST databases containing data for the given compound
- Type:
 _tp.Dict[str, str]
- nist_subscription_refs
 references to webpages of subscription NIST databases containing data for the given compound
- Type:
 _tp.Dict[str, str]
- mol2D
 text block of a MOL-file containing 2D atomic coordinates
- Type:
 _tp.Optional[str]
- mol3D
 text block of a MOL-file containing 3D atomic coordinates
- Type:
 _tp.Optional[str]
- ir_specs
 list of IR Spectrum objects
- Type:
 _tp.List[Spectrum]
- thz_specs
 list of THz Spectrum objects
- Type:
 _tp.List[Spectrum]
- ms_specs
 list of MS Spectrum objects
- Type:
 _tp.List[Spectrum]
- uv_specs
 list of UV-Vis Spectrum objects
- Type:
 _tp.List[Spectrum]
- gas_chromat
 list of Chromatogram objects
- Type:
 _tp.List[Chromatogram]
- ID: str | None
 
- name: str | None
 
- synonyms: List[str]
 
- formula: str | None
 
- mol_weight: float | None
 
- inchi: str | None
 
- inchi_key: str | None
 
- cas_rn: str | None
 
- mol_refs: Dict[str, str]
 
- data_refs: Dict[str, str]
 
- nist_public_refs: Dict[str, str]
 
- nist_subscription_refs: Dict[str, str]
 
- mol2D: str | None
 
- mol3D: str | None
 
- ir_specs: List[Spectrum]
 
- thz_specs: List[Spectrum]
 
- ms_specs: List[Spectrum]
 
- uv_specs: List[Spectrum]
 
- gas_chromat: List[Chromatogram]
 
- get_molfile(dim: int) None
 Loads text block of 2D / 3D molfile
- Parameters:
 dim (int) – dimensionality of molfile (2D / 3D)
- get_mol2D() None
 Loads text block of 2D molfile
- get_mol3D() None
 Loads text block of 3D molfile
- get_molfiles() None
 Loads text block of all available molfiles
- get_spectrum(spec_type: str, spec_idx: str) Spectrum
 Loads spectrum of given type (IR / TZ / MS / UV) and index
- Parameters:
 spec_type (str) – spectrum type [ IR / TZ / MS / UV ]
spec_idx (str) – spectrum index
- Returns:
 wrapper for the text block of JDX-formatted spectrum
- Return type:
 Spectrum
- get_spectra(spec_type: str) None
 Loads all available spectra of given type (IR / TZ / MS / UV)
- Parameters:
 spec_type (str) – spectrum type [ IR / TZ / MS / UV ]
- get_ir_spectra() None
 Loads all available IR spectra
- get_thz_spectra() None
 Loads all available THz spectra
- get_ms_spectra() None
 Loads all available MS spectra
- get_uv_spectra() None
 Loads all available UV-Vis spectra
- get_all_spectra() None
 Loads all available spectra
- save_spectra(spec_type: str, path_dir: str = './') None
 Saves all spectra of given type to the specified folder
- Parameters:
 spec_type (str) – spectrum type [ IR / TZ / MS / UV ]
path_dir (str) – directory to save spectra
- save_ir_spectra(path_dir: str = './') None
 Saves IR spectra to the specified folder
- Parameters:
 path_dir (str) – directory to save spectra
- save_thz_spectra(path_dir: str = './') None
 Saves THz spectra to the specified folder
- Parameters:
 path_dir (str) – directory to save spectra
- save_ms_spectra(path_dir: str = './') None
 Saves mass spectra to the specified folder
- Parameters:
 path_dir (str) – directory to save spectra
- save_uv_spectra(path_dir: str = './') None
 Saves all UV-Vis spectra to the specified folder
- Parameters:
 path_dir (str) – directory to save spectra
- save_all_spectra(path_dir: str = './') None
 Saves all available spectra to the specified folder
- Parameters:
 path_dir (str) – directory to save spectra
- get_gas_chromatography() None
 Loads info on gas chromatography
- save_gas_chromatography(path_dir: str = './', **kwargs) None
 Saves all tables with data on gas chromatography experiments
- Parameters:
 path_dir (str) – directory to save spectra
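The get_* methods return None and fill the corresponding attributes (ir_specs, ms_specs and so on), after which the save_* methods write the loaded items to disk. A typical load-then-save loop can be sketched as below, using only the get_spectra and save_spectra methods documented above. The helper name fetch_and_save_spectra and the constant SPECTRUM_CODES are assumptions for illustration.

```python
import os
from typing import Any, Iterable, List

# Spectrum type codes accepted by get_spectra / save_spectra per the docs above.
SPECTRUM_CODES = ("IR", "TZ", "MS", "UV")


def fetch_and_save_spectra(compound: Any, out_dir: str,
                           spec_types: Iterable[str] = SPECTRUM_CODES) -> List[str]:
    """Load, then save, the spectra of each requested type for one compound."""
    spec_types = list(spec_types)
    unknown = [code for code in spec_types if code not in SPECTRUM_CODES]
    if unknown:
        raise ValueError(f"unknown spectrum types: {unknown}")
    # Create the target directory up front so saving does not depend on it existing.
    os.makedirs(out_dir, exist_ok=True)
    for code in spec_types:
        compound.get_spectra(code)                     # populates e.g. ir_specs
        compound.save_spectra(code, path_dir=out_dir)  # writes JDX files
    return spec_types
```

Given a NistCompound obtained from get_compound, fetch_and_save_spectra(compound, "spectra", ["IR", "MS"]) would download and store its IR and mass spectra in a local spectra folder.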
- nistchempy.compound.compound_from_response(nr: NistResponse, request_config: RequestConfig | None = None) NistCompound | None
 Initializes NistCompound object from the corresponding response
- Parameters:
 nr (_ncpr.NistResponse) – response to the GET request for a compound
request_config (_tp.Optional[_ncpr.RequestConfig]) – additional requests.get parameters
- Returns:
 NistCompound object, or None if there are several compounds corresponding to the given ID
- Return type:
 _tp.Optional[NistCompound]
- nistchempy.compound.get_compound(ID: str, request_config: RequestConfig | None = None) NistCompound | None
 Loads the main info on the given NIST compound
- Parameters:
 ID (str) – NIST compound ID, CAS RN or InChI
request_config (_tp.Optional[_ncpr.RequestConfig]) – additional requests.get parameters
- Returns:
 NistCompound object, or None if there are several compounds corresponding to the given ID
- Return type:
 _tp.Optional[NistCompound]
nistchempy.search
The module contains search-related functionality
- nistchempy.search.get_search_parameters() Dict[str, str]
 Returns search parameters and the corresponding keys
- Returns:
 {short_key => search_parameter}
- Return type:
 _tp.Dict[str, str]
- nistchempy.search.print_search_parameters() None
 Prints available search parameters
- class nistchempy.search.NistSearchParameters(use_SI: bool = True, match_isotopes: bool = False, allow_other: bool = False, allow_extra: bool = False, no_ion: bool = False, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False)
 Bases: object
GET parameters for compound search of NIST Chemistry WebBook
- use_SI
 if True, returns results in SI units; otherwise calories are used
- Type:
 bool
- match_isotopes
 if True, exactly matches the specified isotopes (formula search only)
- Type:
 bool
- allow_other
 if True, allows elements not specified in formula (formula search only)
- Type:
 bool
- allow_extra
 if True, allows more atoms of elements in formula than specified (formula search only)
- Type:
 bool
- no_ion
 if True, excludes ions from the search (formula search only)
- Type:
 bool
- cTG
 if True, returns entries containing gas-phase thermodynamic data
- Type:
 bool
- cTC
 if True, returns entries containing condensed-phase thermodynamic data
- Type:
 bool
- cTP
 if True, returns entries containing phase-change thermodynamic data
- Type:
 bool
- cTR
 if True, returns entries containing reaction thermodynamic data
- Type:
 bool
- cIE
 if True, returns entries containing ion energetics thermodynamic data
- Type:
 bool
- cIC
 if True, returns entries containing ion cluster thermodynamic data
- Type:
 bool
- cIR
 if True, returns entries containing IR data
- Type:
 bool
- cTZ
 if True, returns entries containing THz IR data
- Type:
 bool
- cMS
 if True, returns entries containing MS data
- Type:
 bool
- cUV
 if True, returns entries containing UV/Vis data
- Type:
 bool
- cGC
 if True, returns entries containing gas chromatography data
- Type:
 bool
- cES
 if True, returns entries containing vibrational and electronic energy levels
- Type:
 bool
- cDI
 if True, returns entries containing constants of diatomic molecules
- Type:
 bool
- cSO
 if True, returns entries containing info on Henry’s law
- Type:
 bool
- use_SI: bool = True
 
- match_isotopes: bool = False
 
- allow_other: bool = False
 
- allow_extra: bool = False
 
- no_ion: bool = False
 
- cTG: bool = False
 
- cTC: bool = False
 
- cTP: bool = False
 
- cTR: bool = False
 
- cIE: bool = False
 
- cIC: bool = False
 
- cIR: bool = False
 
- cTZ: bool = False
 
- cMS: bool = False
 
- cUV: bool = False
 
- cGC: bool = False
 
- cES: bool = False
 
- cDI: bool = False
 
- cSO: bool = False
 
- get_request_parameters() dict
 Returns dictionary containing GET parameters
- Returns:
 dictionary of GET parameters relevant to the search
- Return type:
 dict
- class nistchempy.search.NistSearch(_request_config: RequestConfig, _nist_response: NistResponse, search_parameters: NistSearchParameters, compound_ids: List[str], success: bool, lost: bool)
 Bases: object
Results of the compound search in NIST Chemistry WebBook
- _request_config
 additional requests.get parameters
- Type:
 _ncpr.RequestConfig
- _nist_response
 NIST search response
- Type:
 NistResponse
- search_parameters
 used search parameters
- Type:
 NistSearchParameters
- compound_ids
 NIST IDs of found compounds
- Type:
 _tp.List[str]
- compounds
 NistCompound objects of found compounds
- Type:
 _tp.List[_compound.NistCompound]
- success
 True if search request was successful
- Type:
 bool
- num_compounds
 number of found compounds
- Type:
 int
- lost
 True if search returns fewer compounds than there are in the database
- Type:
 bool
- search_parameters: NistSearchParameters
 
- compound_ids: List[str]
 
- compounds: List[NistCompound]
 
- success: bool
 
- num_compounds: int
 
- lost: bool
 
- load_found_compounds() None
 Loads found compounds
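A search object initially carries only compound_ids; the compounds list is filled by load_found_compounds. The success and lost attributes are worth checking before relying on the results. One possible sketch of that workflow follows. The helper name collect_compounds and the choice to warn rather than fail on lost results are assumptions for illustration.

```python
import warnings
from typing import Any, List


def collect_compounds(search: Any) -> List[Any]:
    """Load and return the compounds of a NistSearch-like object."""
    if not search.success:
        raise RuntimeError("NIST search request was not successful")
    if search.lost:
        # The search returned fewer compounds than the database contains.
        warnings.warn("search results are incomplete; consider narrowing the query")
    search.load_found_compounds()  # fills search.compounds
    return list(search.compounds)
```

For instance, collect_compounds(nistchempy.run_search("C2H6O", "formula")) would return the loaded NistCompound objects for that formula, with one request per found compound.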
- nistchempy.search.search_from_response(nr: NistResponse, search_parameters: NistSearchParameters, config: RequestConfig) NistSearch
 Transforms a search response into a NistSearch object
- Parameters:
 nr (_ncpr.NistResponse) – NIST response object
search_parameters (NistSearchParameters) – search request parameters
config (_ncpr.RequestConfig) – search request config
- Returns:
 search results
- Return type:
 NistSearch
- nistchempy.search.run_search(identifier: str, search_type: str, search_parameters: NistSearchParameters | None = None, request_config: RequestConfig | None = None, use_SI: bool = True, match_isotopes: bool = False, allow_other: bool = False, allow_extra: bool = False, no_ion: bool = False, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False) NistSearch
 Searches compounds in NIST Chemistry WebBook
- Parameters:
 identifier (str) – NIST compound ID / formula / name / inchi / CAS RN
search_type (str) – identifier type; available options are ‘formula’, ‘name’, ‘inchi’, ‘cas’ and ‘id’
search_parameters (_tp.Optional[NistSearchParameters]) – search parameters; if provided, the following search parameter arguments are ignored
request_config (_tp.Optional[_ncpr.RequestConfig]) – additional requests.get parameters
use_SI (bool) – if True, returns results in SI units; otherwise calories are used
match_isotopes (bool) – if True, exactly matches the specified isotopes (formula search only)
allow_other (bool) – if True, allows elements not specified in formula (formula search only)
allow_extra (bool) – if True, allows more atoms of elements in formula than specified (formula search only)
no_ion (bool) – if True, excludes ions from the search (formula search only)
cTG (bool) – if True, returns entries containing gas-phase thermodynamic data
cTC (bool) – if True, returns entries containing condensed-phase thermodynamic data
cTP (bool) – if True, returns entries containing phase-change thermodynamic data
cTR (bool) – if True, returns entries containing reaction thermodynamic data
cIE (bool) – if True, returns entries containing ion energetics thermodynamic data
cIC (bool) – if True, returns entries containing ion cluster thermodynamic data
cIR (bool) – if True, returns entries containing IR data
cTZ (bool) – if True, returns entries containing THz IR data
cMS (bool) – if True, returns entries containing MS data
cUV (bool) – if True, returns entries containing UV/Vis data
cGC (bool) – if True, returns entries containing gas chromatography data
cES (bool) – if True, returns entries containing vibrational and electronic energy levels
cDI (bool) – if True, returns entries containing constants of diatomic molecules
cSO (bool) – if True, returns entries containing info on Henry’s law
- Returns:
 search object containing info on found compounds
- Return type:
 NistSearch
- nistchempy.search.run_structural_search(molfile: str | None = None, molblock: str | None = None, search_type: str = 'sub', search_parameters: NistSearchParameters | None = None, request_config: RequestConfig | None = None, use_SI: bool = True, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False) NistSearch
 Runs (sub)structural search for compounds in NIST Chemistry WebBook
- Parameters:
 molfile (_tp.Optional[str]) – path to the MOL-file of the structure to search; if specified, molblock is ignored
molblock (_tp.Optional[str]) – text of the MOL-file of the structure to search
search_type (str) – type of structural search; available options are ‘struct’ (exact match) and ‘sub’ (substructure search, default)
search_parameters (_tp.Optional[NistSearchParameters]) – search parameters; if provided, the following search parameter arguments are ignored
request_config (_tp.Optional[_ncpr.RequestConfig]) – additional requests.get parameters
use_SI (bool) – if True, returns results in SI units; otherwise calories are used
cTG (bool) – if True, returns entries containing gas-phase thermodynamic data
cTC (bool) – if True, returns entries containing condensed-phase thermodynamic data
cTP (bool) – if True, returns entries containing phase-change thermodynamic data
cTR (bool) – if True, returns entries containing reaction thermodynamic data
cIE (bool) – if True, returns entries containing ion energetics thermodynamic data
cIC (bool) – if True, returns entries containing ion cluster thermodynamic data
cIR (bool) – if True, returns entries containing IR data
cTZ (bool) – if True, returns entries containing THz IR data
cMS (bool) – if True, returns entries containing MS data
cUV (bool) – if True, returns entries containing UV/Vis data
cGC (bool) – if True, returns entries containing gas chromatography data
cES (bool) – if True, returns entries containing vibrational and electronic energy levels
cDI (bool) – if True, returns entries containing constants of diatomic molecules
cSO (bool) – if True, returns entries containing info on Henry’s law
- Returns:
 search object containing info on found compounds
- Return type:
 NistSearch
nistchempy.compound_list
Loads pre-prepared info on compounds structure and data availability
- nistchempy.compound_list.get_all_data() DataFrame
 Returns pandas dataframe containing info on all NIST Chem WebBook compounds
- Returns:
 dataframe containing pre-extracted compound info
- Return type:
 _pd.core.frame.DataFrame
nistchempy.requests
Request wrappers for NIST Chemistry WebBook APIs
- nistchempy.requests.BASE_URL
 base URL of the NIST Chemistry WebBook database
- Type:
 str
- nistchempy.requests.SEARCH_URL
 relative URL for the search API
- Type:
 str
- nistchempy.requests.INCHI_URL
 relative URL for obtaining NIST compounds via InChI
- Type:
 str
- class nistchempy.requests.RequestConfig(delay: float = 0.0, max_attempts: int | None = 1, kwargs: dict = <factory>)
 Bases: object
Contains parameters used by the make_nist_request function
- Attributes:
 delay (float): time delay in seconds after getting a response from NIST
max_attempts (_tp.Optional[int]): if > 1, enables reattempting to get a response in case of request errors or a non-OK response
kwargs (dict): kwargs for requests.get inside make_nist_request
- delay: float = 0.0
 
- max_attempts: int | None = 1
 
- kwargs: dict
 
- nistchempy.requests.fix_html(html: str) str
 Fixes detected typos in html code of NIST Chem WebBook web pages
- Parameters:
 html (str) – text of html-file
- Returns:
 fixed html-file
- Return type:
 str
- class nistchempy.requests.NistResponse(response: Response)
 Bases: object
Describes response to the GET request to the NIST Chemistry WebBook
- response
 request’s response
- Type:
 _requests.models.Response
- ok
 True if request’s status code is less than 400
- Type:
 bool
- content_type
 content type of the response
- Type:
 _tp.Optional[str]
- text
 text of the response
- Type:
 _tp.Optional[str]
- soup
 BeautifulSoup object of the html response
- Type:
 _tp.Optional[_bs4.BeautifulSoup]
- response: Response
 
- ok: bool
 
- content_type: str | None
 
- text: str | None
 
- soup: BeautifulSoup | None = None
 
- nistchempy.requests.make_nist_request(url: str, params: dict = {}, config: RequestConfig | None = None) NistResponse
 Dummy GET request to the NIST Chemistry WebBook
- Parameters:
 url (str) – URL of the NIST webpage
params (dict) – GET request parameters
config (_tp.Optional[RequestConfig]) – additional requests.get parameters
- Returns:
 wrapper for the request’s response
- Return type:
 NistResponse
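The delay and max_attempts fields of RequestConfig describe a pause after each response and a retry on request errors or non-OK responses. The sketch below illustrates that general pattern in isolation. It is not the library's implementation, it does not cover max_attempts=None, and every name in it (call_with_retries, fetch, is_ok, sleep) is hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(fetch: Callable[[], T], is_ok: Callable[[T], bool],
                      max_attempts: int = 1, delay: float = 0.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fetch` until it yields an OK result or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            result = fetch()
        except Exception as exc:  # request error: remember it and retry
            last_error = exc
            continue
        sleep(delay)  # pause after every response that was received
        if is_ok(result):
            return result
        last_error = RuntimeError("non-OK response")
    raise RuntimeError(f"request failed after {max_attempts} attempt(s)") from last_error
```

The sleep argument is injectable so the loop can be tried out without actually waiting.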
- nistchempy.requests.make_nist_post_request(url: str, data: dict = {}, json: dict = {}, files: dict = {}, config: RequestConfig | None = None) NistResponse
 Dummy POST request to the NIST Chemistry WebBook
- Parameters:
 url (str) – URL of the NIST webpage
data (dict) – POST data object to send in the body of the request
json (dict) – JSON serializable object to send in the body of the request
files (dict) – POST kwarg to send files in the body of the request
config (_tp.Optional[RequestConfig]) – additional requests.post parameters
- Returns:
 wrapper for the request’s response
- Return type:
 NistResponse
nistchempy.parsing
The module contains parsing-related functionality
- nistchempy.parsing.is_compound_page(soup: BeautifulSoup) bool
 Checks if html is a single compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 True for a single compound page
- Return type:
 bool
- nistchempy.parsing.get_found_compounds(soup: BeautifulSoup) dict
 Extracts IDs of found compounds for NIST Chemistry WebBook search
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 extracted NIST search parameters
- Return type:
 dict
- nistchempy.parsing.parse_compound_page(soup: BeautifulSoup) dict | None
 Parses NIST compound webpage and returns dictionary with extracted info
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 dictionary with extracted info, or None if the webpage does not correspond to a single compound
- Return type:
 _tp.Optional[dict]
- nistchempy.parsing.get_chromatography_table_refs(soup: BeautifulSoup) List[str]
 Extracts references to large format tables containing info on chromatographic experiments
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 list of URLs
- Return type:
 _tp.List[str]
- nistchempy.parsing.parse_chromatography_table(soup: BeautifulSoup) dict
 Parses a large format table containing info on chromatographic experiments
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 contains info to initialize nistchempy.compound.Chromatogram
- Return type:
 dict
nistchempy.parsing.compound
The module contains functionality to parse basic compound properties
- nistchempy.parsing.compound.get_found_compounds(soup: BeautifulSoup) dict
 Extracts IDs of found compounds for NIST Chemistry WebBook search
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 extracted NIST search parameters
- Return type:
 dict
- nistchempy.parsing.compound.is_compound_page(soup: BeautifulSoup) bool
 Checks if html is a single compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 True for a single compound page
- Return type:
 bool
- nistchempy.parsing.compound.get_compound_id_from_comment(soup: BeautifulSoup) str | None
 Extracts compound ID from commented field in Notes section
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 NIST compound ID, None if not detected
- Return type:
 _tp.Optional[str]
- nistchempy.parsing.compound.get_compound_id_from_units_switch(soup: BeautifulSoup) str | None
 Extracts compound ID from url to switch energy units
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 NIST compound ID, None if not detected
- Return type:
 _tp.Optional[str]
- nistchempy.parsing.compound.get_compound_id_from_data_refs(soup: BeautifulSoup) str | None
 Extracts compound ID from urls to compound data
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 NIST compound ID, None if not detected
- Return type:
 _tp.Optional[str]
- nistchempy.parsing.compound.get_compound_id(soup: BeautifulSoup) str | None
 Checks if html is a single compound page and returns NIST compound ID if so
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 NIST compound ID for single compound webpage and None otherwise
- Return type:
 _tp.Optional[str]
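The three get_compound_id_from_* helpers above each return an ID or None, which suggests that get_compound_id may try them in turn. The reference does not confirm that order or mechanism, so the following is only a sketch of such a fallback chain. The name first_detected_id, the string-based extractors and the placeholder ID "C123" are all hypothetical.

```python
from typing import Callable, Iterable, Optional


def first_detected_id(
    page: str,
    extractors: Iterable[Callable[[str], Optional[str]]],
) -> Optional[str]:
    """Return the first non-None ID produced by the extractors, else None."""
    for extract in extractors:
        compound_id = extract(page)
        if compound_id is not None:
            return compound_id
    return None


# Hypothetical extractors operating on plain text instead of a bs4 soup
extractors = [
    lambda page: None,                                  # source not present
    lambda page: "C123" if "ID=C123" in page else None,  # placeholder ID
]
found = first_detected_id("...ID=C123...", extractors)
missing = first_detected_id("no id here", extractors)
```

Returning None when every extractor fails mirrors the documented behaviour for pages that are not single-compound pages.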
- nistchempy.parsing.compound.get_compound_name(soup: BeautifulSoup) str
 Extracts chemical name from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 chemical name of a NIST compound
- Return type:
 str
- nistchempy.parsing.compound.get_compound_synonyms(soup: BeautifulSoup) List[str]
 Extracts synonyms of chemical name from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 list of alternative chemical names
- Return type:
 _tp.List[str]
- nistchempy.parsing.compound.get_compound_formula(soup: BeautifulSoup) str | None
 Extracts chemical formula from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 chemical formula, and None if not found
- Return type:
 _tp.Optional[str]
- nistchempy.parsing.compound.get_compound_mol_weight(soup: BeautifulSoup) float | None
 Extracts molecular weight from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 molecular weight, and None if not found
- Return type:
 _tp.Optional[float]
- nistchempy.parsing.compound.get_compound_inchi(soup: BeautifulSoup) str | None
 Extracts InChI from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 InChI string, and None if not found
- Return type:
 _tp.Optional[str]
- nistchempy.parsing.compound.get_compound_inchi_key(soup: BeautifulSoup) str | None
 Extracts InChI key from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 InChI key string, and None if not found
- Return type:
 _tp.Optional[str]
- nistchempy.parsing.compound.get_compound_casrn(soup: BeautifulSoup) str | None
 Extracts CAS registry number from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 CAS RN, and None if not found
- Return type:
 _tp.Optional[str]
- nistchempy.parsing.compound.get_compound_mol_refs(soup: BeautifulSoup) Dict[str, str]
 Extracts dictionary of URLs for compound MOL-files from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 mol2D / mol3D are keys, URLs are values
- Return type:
 _tp.Dict[str, str]
- nistchempy.parsing.compound.get_compound_data_refs(soup: BeautifulSoup) Dict[str, str]
 Extracts dictionary of URLs for compound properties from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 property names are keys, URLs are values
- Return type:
 _tp.Dict[str, str]
- nistchempy.parsing.compound.get_compound_nist_public_refs(soup: BeautifulSoup) Dict[str, str]
 Extracts dictionary of URLs for compound properties stored at other public NIST sites from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 property names are keys, URLs are values
- Return type:
 _tp.Dict[str, str]
- nistchempy.parsing.compound.get_compound_nist_subscription_refs(soup: BeautifulSoup) Dict[str, str]
 Extracts dictionary of URLs for compound properties stored at other subscription NIST sites from compound page
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 property names are keys, URLs are values
- Return type:
 _tp.Dict[str, str]
- nistchempy.parsing.compound.parse_compound_page(soup: BeautifulSoup) dict | None
 Parses NIST compound webpage and returns dictionary with extracted info
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 dictionary with extracted info, or None if the webpage does not correspond to a single compound
- Return type:
 _tp.Optional[dict]
nistchempy.parsing.gas_chromatography
The module contains functionality to parse gas chromatography info
- nistchempy.parsing.gas_chromatography.get_chromatography_table_refs(soup: BeautifulSoup) List[str]
 Extracts references to large format tables containing info on chromatographic experiments
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 list of URLs
- Return type:
 _tp.List[str]
- nistchempy.parsing.gas_chromatography.get_literature_references(soup: BeautifulSoup) Dict[str, str]
 Extracts literature references from the corresponding section
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 ref’s span id => full reference text
- Return type:
 _tp.Dict[str, str]
- nistchempy.parsing.gas_chromatography.parse_chromatography_table(soup: BeautifulSoup) dict
 Parses a large format table containing info on chromatographic experiments
- Parameters:
 soup (_bs4.BeautifulSoup) – bs4-parsed web-page
- Returns:
 contains info to initialize nistchempy.compound.Chromatogram
- Return type:
 dict
nistchempy.utils
Utility functions
- nistchempy.utils.get_crawl_delay(useragent: str = '*', config: RequestConfig | None = None) float
 Returns NIST Chemistry WebBook’s crawl delay for the given user agent
- Parameters:
 useragent (str) – user agent
- Returns:
 crawl delay in seconds
- Return type:
 float
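How get_crawl_delay obtains its value is not described here. A crawl delay is conventionally published in a site's robots.txt, so the sketch below reads one from robots.txt text with the standard library's urllib.robotparser. The function name, the sample robots.txt content and the fallback of 0.0 when no delay is declared are assumptions, not the library's behaviour.

```python
from urllib.robotparser import RobotFileParser


def crawl_delay_from_robots(robots_txt: str, useragent: str = "*") -> float:
    """Return the Crawl-delay declared for useragent, or 0.0 if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in-memory text, no network
    delay = parser.crawl_delay(useragent)
    return float(delay) if delay is not None else 0.0


# Made-up robots.txt content for illustration; not the WebBook's actual file
sample = "User-agent: *\nCrawl-delay: 5\n"
delay_seconds = crawl_delay_from_robots(sample)
```

The returned value could then be passed as the delay of a RequestConfig so that repeated requests respect the published limit.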