NistChemPy API
nistchempy
This package is a Python interface for the NIST Chemistry WebBook database. It provides additional data for efficient compound search and automatic retrieval of the stored physico-chemical data.
- nistchempy.get_all_data() DataFrame
Returns pandas dataframe containing info on all NIST Chem WebBook compounds
- Returns:
dataframe containing pre-extracted compound info
- Return type:
_pd.core.frame.DataFrame
- nistchempy.get_compound(ID: str, **kwargs) NistCompound | None
Loads the main info on the given NIST compound
- Parameters:
ID (str) – NIST compound ID, CAS RN or InChI
kwargs – requests.get kwargs parameters
- Returns:
NistCompound object, or None if there are several compounds corresponding to the given ID
- Return type:
_tp.Optional[NistCompound]
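Because get_compound may return None, callers usually need a guard before reading attributes. The following is a minimal sketch and not part of the package: describe_compound and fake_loader are hypothetical helpers, and the loader is passed in as an argument so the example runs without network access. Only attributes documented for NistCompound (name, formula) are used.

```python
from types import SimpleNamespace
from typing import Callable, Optional


def describe_compound(ID: str, loader: Callable[[str], Optional[object]]) -> str:
    """Return a one-line summary, or a notice when no single compound is returned.

    `loader` is expected to behave like nistchempy.get_compound:
    it returns a compound-like object or None.
    """
    compound = loader(ID)
    if compound is None:
        return f"{ID}: no unique compound found"
    return f"{ID}: {compound.name} ({compound.formula})"


def fake_loader(ID: str):
    """Offline stand-in for get_compound (hypothetical, for demonstration only)."""
    known = {"C71432": SimpleNamespace(name="Benzene", formula="C6H6")}
    return known.get(ID)


summary_found = describe_compound("C71432", fake_loader)
summary_missing = describe_compound("C000", fake_loader)
```

With the package installed, nistchempy.get_compound could be passed in place of fake_loader.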
- nistchempy.run_search(identifier: str, search_type: str, search_parameters: NistSearchParameters | None = None, use_SI: bool = True, match_isotopes: bool = False, allow_other: bool = False, allow_extra: bool = False, no_ion: bool = False, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False, **kwargs) NistSearch
Searches compounds in NIST Chemistry WebBook
- Parameters:
identifier (str) – NIST compound ID / formula / name / inchi / CAS RN
search_type (str) – identifier type; available options are ‘formula’, ‘name’, ‘inchi’, ‘cas’, and ‘id’
search_parameters (_tp.Optional[NistSearchParameters]) – search parameters; if provided, the following search parameter arguments are ignored
use_SI (bool) – if True, returns results in SI units; otherwise calories are used
match_isotopes (bool) – if True, exactly matches the specified isotopes (formula search only)
allow_other (bool) – if True, allows elements not specified in formula (formula search only)
allow_extra (bool) – if True, allows more atoms of elements in formula than specified (formula search only)
no_ion (bool) – if True, excludes ions from the search (formula search only)
cTG (bool) – if True, returns entries containing gas-phase thermodynamic data
cTC (bool) – if True, returns entries containing condensed-phase thermodynamic data
cTP (bool) – if True, returns entries containing phase-change thermodynamic data
cTR (bool) – if True, returns entries containing reaction thermodynamic data
cIE (bool) – if True, returns entries containing ion energetics thermodynamic data
cIC (bool) – if True, returns entries containing ion cluster thermodynamic data
cIR (bool) – if True, returns entries containing IR data
cTZ (bool) – if True, returns entries containing THz IR data
cMS (bool) – if True, returns entries containing MS data
cUV (bool) – if True, returns entries containing UV/Vis data
cGC (bool) – if True, returns entries containing gas chromatography data
cES (bool) – if True, returns entries containing vibrational and electronic energy levels
cDI (bool) – if True, returns entries containing constants of diatomic molecules
cSO (bool) – if True, returns entries containing info on Henry’s law
kwargs – requests.get parameters
- Returns:
search object containing info on found compounds
- Return type:
NistSearch
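A typical call passes an identifier, its type, and any data-availability flags, then inspects the documented success, lost, and compound_ids attributes of the result. The sketch below illustrates that flow under stated assumptions: search_ids and fake_run_search are hypothetical names, and the search function is injected so the example runs offline.

```python
from types import SimpleNamespace
from typing import Callable, List


def search_ids(identifier: str, search_type: str,
               run_search: Callable[..., object], **flags: bool) -> List[str]:
    """Run a search and return the found compound IDs (empty list on failure)."""
    search = run_search(identifier, search_type, **flags)
    if not search.success:
        return []
    if search.lost:
        # The result set is incomplete; a narrower query may be needed.
        print("warning: the search returned fewer compounds than exist in the database")
    return list(search.compound_ids)


def fake_run_search(identifier: str, search_type: str, **flags: bool):
    """Offline stand-in that mimics the documented NistSearch attributes."""
    return SimpleNamespace(success=True, lost=False, compound_ids=["C71432"])


ids = search_ids("C6H6", "formula", fake_run_search, cIR=True)
```

With the package installed, nistchempy.run_search could be passed as the run_search argument.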
- class nistchempy.NistSearchParameters(use_SI: bool = True, match_isotopes: bool = False, allow_other: bool = False, allow_extra: bool = False, no_ion: bool = False, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False)
Bases:
object
GET parameters for compound search of NIST Chemistry WebBook
- use_SI
if True, returns results in SI units; otherwise calories are used
- Type:
bool
- match_isotopes
if True, exactly matches the specified isotopes (formula search only)
- Type:
bool
- allow_other
if True, allows elements not specified in formula (formula search only)
- Type:
bool
- allow_extra
if True, allows more atoms of elements in formula than specified (formula search only)
- Type:
bool
- no_ion
if True, excludes ions from the search (formula search only)
- Type:
bool
- cTG
if True, returns entries containing gas-phase thermodynamic data
- Type:
bool
- cTC
if True, returns entries containing condensed-phase thermodynamic data
- Type:
bool
- cTP
if True, returns entries containing phase-change thermodynamic data
- Type:
bool
- cTR
if True, returns entries containing reaction thermodynamic data
- Type:
bool
- cIE
if True, returns entries containing ion energetics thermodynamic data
- Type:
bool
- cIC
if True, returns entries containing ion cluster thermodynamic data
- Type:
bool
- cIR
if True, returns entries containing IR data
- Type:
bool
- cTZ
if True, returns entries containing THz IR data
- Type:
bool
- cMS
if True, returns entries containing MS data
- Type:
bool
- cUV
if True, returns entries containing UV/Vis data
- Type:
bool
- cGC
if True, returns entries containing gas chromatography data
- Type:
bool
- cES
if True, returns entries containing vibrational and electronic energy levels
- Type:
bool
- cDI
if True, returns entries containing constants of diatomic molecules
- Type:
bool
- cSO
if True, returns entries containing info on Henry’s law
- Type:
bool
- use_SI: bool = True
- match_isotopes: bool = False
- allow_other: bool = False
- allow_extra: bool = False
- no_ion: bool = False
- cTG: bool = False
- cTC: bool = False
- cTP: bool = False
- cTR: bool = False
- cIE: bool = False
- cIC: bool = False
- cIR: bool = False
- cTZ: bool = False
- cMS: bool = False
- cUV: bool = False
- cGC: bool = False
- cES: bool = False
- cDI: bool = False
- cSO: bool = False
- get_request_parameters() dict
Returns dictionary containing GET parameters
- Returns:
dictionary of GET parameters relevant to the search
- Return type:
dict
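To illustrate how boolean flags can be turned into a dictionary of GET parameters, here is a simplified sketch. It is not the package's implementation: SketchSearchParameters is a hypothetical class with only a few of the flags, and the key names and the 'on' / 'SI' / 'CAL' values are assumptions made for the example.

```python
from dataclasses import dataclass, fields
from typing import Dict


@dataclass
class SketchSearchParameters:
    """Reduced, hypothetical stand-in for NistSearchParameters."""
    use_SI: bool = True
    no_ion: bool = False
    cIR: bool = False
    cMS: bool = False

    def get_request_parameters(self) -> Dict[str, str]:
        # Assumption: units are sent under a 'Units' key.
        params = {"Units": "SI" if self.use_SI else "CAL"}
        # Assumption: each enabled flag is sent as '<flag>=on'; disabled flags are omitted.
        for f in fields(self):
            if f.name != "use_SI" and getattr(self, f.name):
                params[f.name] = "on"
        return params


params = SketchSearchParameters(cIR=True).get_request_parameters()
```

Omitting disabled flags keeps the dictionary limited to the parameters relevant to the search.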
- nistchempy.get_search_parameters() Dict[str, str]
Returns search parameters and the corresponding keys
- Returns:
{short_key => search_parameter}
- Return type:
_tp.Dict[str, str]
- nistchempy.print_search_parameters() None
Prints available search parameters
nistchempy.requests
Request wrappers for NIST Chemistry WebBook APIs
- nistchempy.requests.BASE_URL
base URL of the NIST Chemistry WebBook database
- Type:
str
- nistchempy.requests.SEARCH_URL
relative URL for the search API
- Type:
str
- nistchempy.requests.fix_html(html: str) str
Fixes detected typos in html code of NIST Chem WebBook web pages
- Parameters:
html (str) – text of html-file
- Returns:
fixed html-file
- Return type:
str
- class nistchempy.requests.NistResponse(response: Response)
Bases:
object
Describes response to the GET request to the NIST Chemistry WebBook
- response
request’s response
- Type:
_requests.models.Response
- ok
True if request’s status code is less than 400
- Type:
bool
- content_type
content type of the response
- Type:
_tp.Optional[str]
- text
text of the response
- Type:
_tp.Optional[str]
- soup
BeautifulSoup object of the html response
- Type:
_tp.Optional[_bs4.BeautifulSoup]
- response: Response
- ok: bool
- content_type: str | None
- text: str | None
- soup: BeautifulSoup | None = None
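The relationship between the attributes above can be sketched with a small stand-in class. This is an illustration only: SketchResponse is hypothetical, it takes plain values instead of a requests Response, and returning None for text on a failed request is an assumption. Unlike requests, the plain dict used here for headers is case-sensitive.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SketchResponse:
    """Hypothetical, simplified analogue of NistResponse."""
    status_code: int
    headers: Dict[str, str]
    body: str = ""

    @property
    def ok(self) -> bool:
        # Mirrors the documented rule: True if the status code is less than 400.
        return self.status_code < 400

    @property
    def content_type(self) -> Optional[str]:
        return self.headers.get("Content-Type")

    @property
    def text(self) -> Optional[str]:
        # Assumption: no text is exposed for failed requests.
        return self.body if self.ok else None


good = SketchResponse(200, {"Content-Type": "text/html"}, "<html></html>")
bad = SketchResponse(404, {})
```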
- nistchempy.requests.make_nist_request(url: str, params: dict = {}, **kwargs) NistResponse
Dummy request to the NIST Chemistry WebBook
- Parameters:
url (str) – URL of the NIST webpage
params (dict) – GET request parameters
kwargs – requests.get kwargs parameters
- Returns:
wrapper for the request’s response
- Return type:
NistResponse
nistchempy.compound
The module contains compound-related functionality
- nistchempy.compound.SPEC_TYPES
dictionary mapping the abbreviations of spectra types used in compound pages (keys) to the urls for downloading JDX-files (values)
- Type:
dict
- nistchempy.compound.urlparse(url, scheme='', allow_fragments=True)
Parse a URL into 6 components: <scheme>://<netloc>/<path>;<params>?<query>#<fragment>
The result is a named 6-tuple with fields corresponding to the above. It is either a ParseResult or ParseResultBytes object, depending on the type of the url parameter.
The username, password, hostname, and port sub-components of netloc can also be accessed as attributes of the returned object.
The scheme argument provides the default value of the scheme component when no scheme is found in url.
If allow_fragments is False, no attempt is made to separate the fragment component from the previous component, which can be either path or query.
Note that % escapes are not expanded.
- nistchempy.compound.parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&')
Parse a query given as a string argument.
Arguments:
- qs: percent-encoded query string to be parsed
- keep_blank_values: flag indicating whether blank values in percent-encoded queries should be treated as blank strings. A true value indicates that blanks should be retained as blank strings. The default false value indicates that blank values are to be ignored and treated as if they were not included.
- strict_parsing: flag indicating what to do with parsing errors. If false (the default), errors are silently ignored. If true, errors raise a ValueError exception.
- encoding and errors: specify how to decode percent-encoded sequences into Unicode characters, as accepted by the bytes.decode() method.
- max_num_fields: int. If set, then throws a ValueError if there are more than n fields read by parse_qsl().
- separator: str. The symbol to use for separating the query arguments. Defaults to &.
Returns a dictionary.
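Both functions come from the standard library module urllib.parse. As a short illustration, the snippet below splits a WebBook-style URL and reads its query parameters; the URL is only an example input and is not taken from the package.

```python
from urllib.parse import parse_qs, urlparse

# Example input: a compound-page style URL with three query parameters.
url = "https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Units=SI&Mask=80"

parts = urlparse(url)          # named 6-tuple: scheme, netloc, path, params, query, fragment
query = parse_qs(parts.query)  # maps each key to a list of values

# parse_qs always returns lists, so take the first element for single-valued keys.
compound_id = query["ID"][0]
```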
- class nistchempy.compound.Spectrum(compound: NistCompound, spec_type: str, spec_idx: str, jdx_text: str)
Bases:
object
Wrapper for IR, THz IR, MS, and UV-Vis spectra extracted from NIST Chemistry WebBook
- compound
parent NistCompound object
- Type:
NistCompound
- spec_type
IR / TZ (THz IR) / MS / UV (UV-Vis)
- Type:
str
- spec_idx
index of the spectrum
- Type:
str
- jdx_text
text block of the corresponding JDX-file
- Type:
str
- compound: NistCompound
- spec_type: str
- spec_idx: str
- jdx_text: str
- save(name=None, path_dir=None)
Saves spectrum in JDX format
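Saving amounts to writing the jdx_text block to a file. The function below is a sketch of that idea under stated assumptions: save_jdx is a hypothetical helper, and the fallback file name and the use of the current working directory when path_dir is None are illustrative choices, not the package's documented defaults.

```python
import os
import tempfile
from typing import Optional


def save_jdx(jdx_text: str, name: Optional[str] = None,
             path_dir: Optional[str] = None,
             default_name: str = "spectrum.jdx") -> str:
    """Write a JDX text block to disk and return the resulting path."""
    file_name = name if name is not None else default_name         # assumed fallback
    directory = path_dir if path_dir is not None else os.getcwd()  # assumed fallback
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, file_name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(jdx_text)
    return path


# Demonstration in a temporary directory, which is removed afterwards.
with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_jdx("##TITLE=demo\n##END=\n", name="demo.jdx", path_dir=tmp)
    with open(saved_path, encoding="utf-8") as handle:
        saved_text = handle.read()
```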
- class nistchempy.compound.NistCompound(ID: str | None, name: str | None, synonyms: List[str], formula: str | None, mol_weight: float | None, inchi: str | None, inchi_key: str | None, cas_rn: str | None, mol_refs: Dict[str, str], data_refs: Dict[str, str], nist_public_refs: Dict[str, str], nist_subscription_refs: Dict[str, str], nist_response: NistResponse)
Bases:
object
Stores info on NIST Chemistry WebBook compound
- ID
NIST compound ID
- Type:
_tp.Optional[str]
- name
chemical name
- Type:
_tp.Optional[str]
- synonyms
synonyms of the chemical name
- Type:
_tp.List[str]
- formula
chemical formula
- Type:
_tp.Optional[str]
- mol_weight
molecular weight, g/mol
- Type:
_tp.Optional[float]
- inchi
InChI string
- Type:
_tp.Optional[str]
- inchi_key
InChI key string
- Type:
_tp.Optional[str]
- cas_rn
CAS registry number
- Type:
_tp.Optional[str]
- mol_refs
references to 2D and 3D MOL-files
- Type:
_tp.Dict[str, str]
- data_refs
references to the webpages containing physical chemical data for the given compound
- Type:
_tp.Dict[str, str]
- nist_public_refs
references to webpages of other public NIST databases containing data for the given compound
- Type:
_tp.Dict[str, str]
- nist_subscription_refs
references to webpages of subscription NIST databases containing data for the given compound
- Type:
_tp.Dict[str, str]
- nist_response
response to the GET request
- Type:
NistResponse
- mol2D
text block of a MOL-file containing 2D atomic coordinates
- Type:
_tp.Optional[str]
- mol3D
text block of a MOL-file containing 3D atomic coordinates
- Type:
_tp.Optional[str]
- ir_specs
list of IR Spectrum objects
- Type:
_tp.List[Spectrum]
- thz_specs
list of THz Spectrum objects
- Type:
_tp.List[Spectrum]
- ms_specs
list of MS Spectrum objects
- Type:
_tp.List[Spectrum]
- uv_specs
list of UV-Vis Spectrum objects
- Type:
_tp.List[Spectrum]
- ID: str | None
- name: str | None
- synonyms: List[str]
- formula: str | None
- mol_weight: float | None
- inchi: str | None
- inchi_key: str | None
- cas_rn: str | None
- mol_refs: Dict[str, str]
- data_refs: Dict[str, str]
- nist_public_refs: Dict[str, str]
- nist_subscription_refs: Dict[str, str]
- nist_response: NistResponse
- mol2D: str | None
- mol3D: str | None
- ir_specs: List[Spectrum]
- thz_specs: List[Spectrum]
- ms_specs: List[Spectrum]
- uv_specs: List[Spectrum]
- get_molfile(dim: int, **kwargs) None
Loads text block of 2D / 3D molfile
- Parameters:
dim (int) – dimensionality of molfile (2D / 3D)
kwargs – requests.get kwargs parameters
- get_mol2D(**kwargs) None
Loads text block of 2D molfile
- Parameters:
kwargs – requests.get kwargs parameters
- get_mol3D(**kwargs) None
Loads text block of 3D molfile
- Parameters:
kwargs – requests.get kwargs parameters
- get_molfiles(**kwargs) None
Loads text block of all available molfiles
- Parameters:
kwargs – requests.get kwargs parameters
- get_spectrum(spec_type: str, spec_idx: str) Spectrum
Loads spectrum of given type (IR / TZ / MS / UV) and index
- Parameters:
spec_type (str) – spectrum type [ IR / TZ / MS / UV ]
spec_idx (str) – spectrum index
- Returns:
wrapper for the text block of JDX-formatted spectrum
- Return type:
Spectrum
- get_spectra(spec_type: str) None
Loads all available spectra of given type (IR / TZ / MS / UV)
- Parameters:
spec_type (str) – spectrum type [ IR / TZ / MS / UV ]
- get_ir_spectra()
Loads all available IR spectra
- get_thz_spectra()
Loads all available THz spectra
- get_ms_spectra()
Loads all available MS spectra
- get_uv_spectra()
Loads all available UV-Vis spectra
- get_all_spectra()
Loads all available spectra
- save_spectra(spec_type, path_dir='./') None
Saves all spectra of given type to the specified folder
- Parameters:
spec_type (str) – spectrum type [ IR / TZ / MS / UV ]
path_dir (str) – directory to save spectra
- save_ir_spectra(path_dir='./') None
Saves IR spectra to the specified folder
- save_thz_spectra(path_dir='./') None
Saves THz spectra to the specified folder
- save_ms_spectra(path_dir='./') None
Saves mass spectra to the specified folder
- save_uv_spectra(path_dir='./') None
Saves all UV-Vis spectra to the specified folder
- save_all_spectra(path_dir='./') None
Saves all available spectra to the specified folder
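The get_* methods return nothing, so the loaded spectra are read from the corresponding list attributes afterwards. The sketch below assumes that get_ir_spectra fills ir_specs, as the attribute descriptions suggest. collect_ir_jdx and FakeCompound are hypothetical names used so the example runs offline.

```python
from types import SimpleNamespace
from typing import List


def collect_ir_jdx(compound) -> List[str]:
    """Load IR spectra for a compound-like object and return their JDX text blocks."""
    compound.get_ir_spectra()  # assumed to populate compound.ir_specs
    return [spec.jdx_text for spec in compound.ir_specs]


class FakeCompound:
    """Offline stand-in exposing the documented ir_specs / get_ir_spectra pair."""

    def __init__(self) -> None:
        self.ir_specs = []

    def get_ir_spectra(self) -> None:
        self.ir_specs = [
            SimpleNamespace(spec_type="IR", spec_idx="0", jdx_text="##TITLE=demo"),
        ]


jdx_blocks = collect_ir_jdx(FakeCompound())
```

With the package installed, an object returned by get_compound could be passed instead of FakeCompound.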
- nistchempy.compound.compound_from_response(nr: NistResponse) NistCompound | None
Initializes NistCompound object from the corresponding response
- Parameters:
nr (_ncpr.NistResponse) – response to the GET request for a compound
- Returns:
NistCompound object, or None if there are several compounds corresponding to the given ID
- Return type:
_tp.Optional[NistCompound]
- nistchempy.compound.get_compound(ID: str, **kwargs) NistCompound | None
Loads the main info on the given NIST compound
- Parameters:
ID (str) – NIST compound ID, CAS RN or InChI
kwargs – requests.get kwargs parameters
- Returns:
NistCompound object, or None if there are several compounds corresponding to the given ID
- Return type:
_tp.Optional[NistCompound]
nistchempy.search
The module contains search-related functionality
- nistchempy.search.get_search_parameters() Dict[str, str]
Returns search parameters and the corresponding keys
- Returns:
{short_key => search_parameter}
- Return type:
_tp.Dict[str, str]
- nistchempy.search.print_search_parameters() None
Prints available search parameters
- class nistchempy.search.NistSearchParameters(use_SI: bool = True, match_isotopes: bool = False, allow_other: bool = False, allow_extra: bool = False, no_ion: bool = False, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False)
Bases:
object
GET parameters for compound search of NIST Chemistry WebBook
- use_SI
if True, returns results in SI units; otherwise calories are used
- Type:
bool
- match_isotopes
if True, exactly matches the specified isotopes (formula search only)
- Type:
bool
- allow_other
if True, allows elements not specified in formula (formula search only)
- Type:
bool
- allow_extra
if True, allows more atoms of elements in formula than specified (formula search only)
- Type:
bool
- no_ion
if True, excludes ions from the search (formula search only)
- Type:
bool
- cTG
if True, returns entries containing gas-phase thermodynamic data
- Type:
bool
- cTC
if True, returns entries containing condensed-phase thermodynamic data
- Type:
bool
- cTP
if True, returns entries containing phase-change thermodynamic data
- Type:
bool
- cTR
if True, returns entries containing reaction thermodynamic data
- Type:
bool
- cIE
if True, returns entries containing ion energetics thermodynamic data
- Type:
bool
- cIC
if True, returns entries containing ion cluster thermodynamic data
- Type:
bool
- cIR
if True, returns entries containing IR data
- Type:
bool
- cTZ
if True, returns entries containing THz IR data
- Type:
bool
- cMS
if True, returns entries containing MS data
- Type:
bool
- cUV
if True, returns entries containing UV/Vis data
- Type:
bool
- cGC
if True, returns entries containing gas chromatography data
- Type:
bool
- cES
if True, returns entries containing vibrational and electronic energy levels
- Type:
bool
- cDI
if True, returns entries containing constants of diatomic molecules
- Type:
bool
- cSO
if True, returns entries containing info on Henry’s law
- Type:
bool
- use_SI: bool = True
- match_isotopes: bool = False
- allow_other: bool = False
- allow_extra: bool = False
- no_ion: bool = False
- cTG: bool = False
- cTC: bool = False
- cTP: bool = False
- cTR: bool = False
- cIE: bool = False
- cIC: bool = False
- cIR: bool = False
- cTZ: bool = False
- cMS: bool = False
- cUV: bool = False
- cGC: bool = False
- cES: bool = False
- cDI: bool = False
- cSO: bool = False
- get_request_parameters() dict
Returns dictionary containing GET parameters
- Returns:
dictionary of GET parameters relevant to the search
- Return type:
dict
- class nistchempy.search.NistSearch(nist_response: NistResponse, search_parameters: NistSearchParameters, compound_ids: List[str], success: bool, lost: bool)
Bases:
object
Results of the compound search in NIST Chemistry WebBook
- nist_response
NIST search response
- Type:
NistResponse
- search_parameters
used search parameters
- Type:
NistSearchParameters
- compound_ids
NIST IDs of found compounds
- Type:
_tp.List[str]
- compounds
NistCompound objects of found compounds
- Type:
_tp.List[_compound.NistCompound]
- success
True if search request was successful
- Type:
bool
- num_compounds
number of found compounds
- Type:
int
- lost
True if search returns fewer compounds than there are in the database
- Type:
bool
- nist_response: NistResponse
- search_parameters: NistSearchParameters
- compound_ids: List[str]
- compounds: List[NistCompound]
- success: bool
- num_compounds: int
- lost: bool
- load_found_compounds(**kwargs) None
Loads found compounds
- Parameters:
kwargs – requests.get kwargs parameters
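A search result initially holds only compound IDs; load_found_compounds fills the compounds list. The sketch below shows one way to use that, assuming the list is populated in place. load_names and FakeSearch are hypothetical names, and the stand-in class keeps the example offline.

```python
from types import SimpleNamespace
from typing import List, Optional


def load_names(search, **kwargs) -> List[Optional[str]]:
    """Load the compounds found by a search and return their names."""
    if not search.success:
        return []
    search.load_found_compounds(**kwargs)  # assumed to populate search.compounds
    return [compound.name for compound in search.compounds]


class FakeSearch:
    """Offline stand-in exposing the documented NistSearch attributes used above."""

    def __init__(self) -> None:
        self.success = True
        self.compound_ids = ["C71432"]
        self.compounds = []

    def load_found_compounds(self, **kwargs) -> None:
        self.compounds = [SimpleNamespace(ID="C71432", name="Benzene")]


names = load_names(FakeSearch())
```

Each loaded compound presumably costs one request, so checking num_compounds before loading a large result set may be worthwhile.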
- nistchempy.search.run_search(identifier: str, search_type: str, search_parameters: NistSearchParameters | None = None, use_SI: bool = True, match_isotopes: bool = False, allow_other: bool = False, allow_extra: bool = False, no_ion: bool = False, cTG: bool = False, cTC: bool = False, cTP: bool = False, cTR: bool = False, cIE: bool = False, cIC: bool = False, cIR: bool = False, cTZ: bool = False, cMS: bool = False, cUV: bool = False, cGC: bool = False, cES: bool = False, cDI: bool = False, cSO: bool = False, **kwargs) NistSearch
Searches compounds in NIST Chemistry WebBook
- Parameters:
identifier (str) – NIST compound ID / formula / name / inchi / CAS RN
search_type (str) – identifier type; available options are ‘formula’, ‘name’, ‘inchi’, ‘cas’, and ‘id’
search_parameters (_tp.Optional[NistSearchParameters]) – search parameters; if provided, the following search parameter arguments are ignored
use_SI (bool) – if True, returns results in SI units; otherwise calories are used
match_isotopes (bool) – if True, exactly matches the specified isotopes (formula search only)
allow_other (bool) – if True, allows elements not specified in formula (formula search only)
allow_extra (bool) – if True, allows more atoms of elements in formula than specified (formula search only)
no_ion (bool) – if True, excludes ions from the search (formula search only)
cTG (bool) – if True, returns entries containing gas-phase thermodynamic data
cTC (bool) – if True, returns entries containing condensed-phase thermodynamic data
cTP (bool) – if True, returns entries containing phase-change thermodynamic data
cTR (bool) – if True, returns entries containing reaction thermodynamic data
cIE (bool) – if True, returns entries containing ion energetics thermodynamic data
cIC (bool) – if True, returns entries containing ion cluster thermodynamic data
cIR (bool) – if True, returns entries containing IR data
cTZ (bool) – if True, returns entries containing THz IR data
cMS (bool) – if True, returns entries containing MS data
cUV (bool) – if True, returns entries containing UV/Vis data
cGC (bool) – if True, returns entries containing gas chromatography data
cES (bool) – if True, returns entries containing vibrational and electronic energy levels
cDI (bool) – if True, returns entries containing constants of diatomic molecules
cSO (bool) – if True, returns entries containing info on Henry’s law
kwargs – requests.get parameters
- Returns:
search object containing info on found compounds
- Return type:
NistSearch
nistchempy.compound_list
Loads pre-prepared info on compounds structure and data availability
- nistchempy.compound_list.get_all_data() DataFrame
Returns pandas dataframe containing info on all NIST Chem WebBook compounds
- Returns:
dataframe containing pre-extracted compound info
- Return type:
_pd.core.frame.DataFrame