Cheminformatics toolkits

Cheminformatics toolkits are software development kits that allow cheminformaticians to develop custom computer applications for use in virtual screening, chemical database mining, and structure-activity studies.[1][2] Toolkits are often used to experiment with new methodologies. Their core functions are manipulating chemical structures and comparing structures with one another. They also give programmatic access to the properties of individual atoms and bonds.
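
As an illustration of what atom-level and bond-level access can look like, the following Python sketch models a molecule as a list of atoms plus a list of bonds. The class names (Atom, Bond, Molecule) and their fields are hypothetical and do not correspond to the API of any toolkit listed in this article; real toolkits expose many more properties, such as aromaticity, stereochemistry, and coordinates.

```python
from dataclasses import dataclass, field


@dataclass
class Atom:
    symbol: str      # element symbol, e.g. "C"
    charge: int = 0  # formal charge


@dataclass
class Bond:
    begin: int       # index of the first atom in Molecule.atoms
    end: int         # index of the second atom
    order: int = 1   # 1 = single, 2 = double, 3 = triple


@dataclass
class Molecule:
    atoms: list = field(default_factory=list)
    bonds: list = field(default_factory=list)

    def add_atom(self, symbol, charge=0):
        """Append an atom and return its index."""
        self.atoms.append(Atom(symbol, charge))
        return len(self.atoms) - 1

    def add_bond(self, begin, end, order=1):
        """Connect two existing atoms by index."""
        self.bonds.append(Bond(begin, end, order))

    def neighbors(self, idx):
        """Return the indices of atoms bonded to atom idx."""
        result = []
        for bond in self.bonds:
            if bond.begin == idx:
                result.append(bond.end)
            elif bond.end == idx:
                result.append(bond.begin)
        return result


# Heavy-atom skeleton of ethanol (C-C-O); hydrogens are left implicit.
ethanol = Molecule()
c1 = ethanol.add_atom("C")
c2 = ethanol.add_atom("C")
o = ethanol.add_atom("O")
ethanol.add_bond(c1, c2)
ethanol.add_bond(c2, o)

# Per-atom and per-bond properties are then plain attribute lookups.
symbols = [atom.symbol for atom in ethanol.atoms]
orders = [bond.order for bond in ethanol.bonds]
```

Storing bonds as index pairs keeps the structure a simple graph, which is the representation that operations such as substructure matching typically work on.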

Functionality

Toolkits provide the following functionality:

  • Read and save structures in various chemistry file formats.
  • Determine if one structure is a substructure of another (substructure matching).
  • Determine if two structures are equal (exact matching).
  • Identify substructures common to a set of structures (maximum common substructure, MCS).
  • Disassemble molecules, splitting into fragments.
  • Assemble molecules from elements or submolecules.
  • Apply reactions to input reactant structures to generate the product structures.
  • Generate molecular fingerprints. Fingerprints are bit-vectors where individual bits correspond to the presence or absence of structural features. A common use of fingerprints is indexing chemical databases, where they serve as a fast pre-filter for substructure and similarity searches.
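
Two of the functions above, substructure matching and fingerprint generation, can be sketched in plain Python. This is a minimal illustration under simplifying assumptions, not the algorithm of any particular toolkit: a molecule is a list of element symbols plus a set of bonded index pairs, bond orders and hydrogens are ignored, the only structural features are single atoms and bonded atom pairs, and the fingerprint has just 64 bits. All function names are hypothetical.

```python
import zlib


def make_mol(symbols, bonds):
    """Build a toy molecule: (list of element symbols, set of bonded index pairs)."""
    return list(symbols), {frozenset(b) for b in bonds}


def is_substructure(query, target):
    """Backtracking search for a mapping of query atoms onto distinct target
    atoms that preserves element symbols and bonds."""
    q_sym, q_bonds = query
    t_sym, t_bonds = target

    def extend(mapping):
        qi = len(mapping)  # index of the next query atom to place
        if qi == len(q_sym):
            return True
        for ti in range(len(t_sym)):
            if ti in mapping or t_sym[ti] != q_sym[qi]:
                continue
            # Each query bond to an already placed atom must exist in the target.
            ok = all(
                frozenset((ti, mapping[qj])) in t_bonds
                for qj in range(qi)
                if frozenset((qi, qj)) in q_bonds
            )
            if ok and extend(mapping + [ti]):
                return True
        return False

    return extend([])


def features(mol):
    """Structural features: element symbols and bonded symbol pairs."""
    symbols, bonds = mol
    feats = set(symbols)
    for bond in bonds:
        i, j = tuple(bond)
        feats.add("-".join(sorted((symbols[i], symbols[j]))))
    return feats


def fingerprint(mol, n_bits=64):
    """Hash each feature to one bit of an integer used as a bit-vector."""
    bits = 0
    for feat in features(mol):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        bits |= 1 << (zlib.crc32(feat.encode("utf-8")) % n_bits)
    return bits


def passes_screen(query_fp, target_fp):
    """A target can contain the query only if it sets every bit the query sets."""
    return (query_fp & target_fp) == query_fp


def tanimoto(a, b):
    """Similarity of two fingerprints: shared bits divided by bits set in either."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 1.0


database = {
    "ethanol": make_mol(["C", "C", "O"], [(0, 1), (1, 2)]),
    "propane": make_mol(["C", "C", "C"], [(0, 1), (1, 2)]),
    "methanol": make_mol(["C", "O"], [(0, 1)]),
}
query = make_mol(["C", "O"], [(0, 1)])  # a C-O fragment
query_fp = fingerprint(query)

# Cheap bitwise screen first, exact graph matching only for the survivors.
hits = [
    name
    for name, mol in database.items()
    if passes_screen(query_fp, fingerprint(mol)) and is_substructure(query, mol)
]
```

Because every feature of a matching query also occurs in the target, the screen never rejects a true match; it can let false candidates through when features collide on the same bit, which is why the exact match is still run afterwards. In a database setting the target fingerprints would be computed once and stored rather than recomputed per query.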

List of notable cheminformatics toolkits

| Name | License | APIs | Home page | Notes |
| CDK | Open source | Java | https://cdk.github.io/ | [3][4][5] |
| ChemmineR | Open source | R, C++ | http://manuals.bioinformatics.ucr.edu/home/chemminer | [6][7] |
| Enalos KNIME nodes | Open source | KNIME | http://tech.knime.org/community/enalos-nodes | [8] |
| Enalos+ KNIME nodes | Proprietary | KNIME | http://enalosplus.novamechanics.com/ | [9] |
| Indigo | Open source | C, C#, Java, Python | http://lifescience.opensource.epam.com/indigo | |
| MolEngine | Proprietary | .NET | https://www.scilligence.com/web/scilligence-regmol/ | |
| Molecular Operating Environment (MOE) | Proprietary | Scientific Vector Language | https://web.archive.org/web/20160909172415/http://www.chemcomp.com/MOE-Cheminformatics_and_QSAR.htm | |
| OpenBabel | Open source | C, Python, Ruby | http://openbabel.org/ | [10][11] |
| Helium | Open source | C++ | https://web.archive.org/web/20140407082845/http://www.moldb.net/helium.html | |
| RDKit | Open source | Python, C++, Java, KNIME | http://www.rdkit.org/ | |
| Rcpi | Open source | R | https://bioconductor.org/packages/Rcpi | [12] |
| frowns | Open source | Python | http://frowns.sourceforge.net/ | |
| OUCH | Open source | Haskell | http://www.pharmash.com/posts/2010-08-02-ouch.html | |
| chemf | Open source | Scala | https://github.com/stefan-hoeck/chemf | |
| 3D-e-Chem | Open source | Python, Java, KNIME | https://3d-e-chem.github.io/ | [13][14] |
| SMSD | Creative Commons Attribution | Java | http://www.ebi.ac.uk/thornton-srv/software/SMSD/ | [15] |
| Accord SDK | Proprietary | VBA, .NET, PL/SQL | http://accelrys.com/products/datasheets/accord-software-development-kit.pdf | |
| CACTVS | Proprietary, free for academia, personal use, public web services | Tcl, C, C++, Python, KNIME | http://www.xemistry.com/academic | |
| Daylight | Proprietary | C, C++, Java, Fortran | http://www.daylight.com/products/toolkit.html | |
| OEChem | Proprietary, free for academia | C++, Python, C#, Java | http://eyesopen.com/ | |
| Marvin, JChem | Proprietary, free for academia | Java, .NET, JavaScript | http://www.chemaxon.com | |
| ChemDoodle API | Proprietary | Java, JavaScript | http://www.ichemlabs.com | |
| PerlMol | Open source | Perl | https://web.archive.org/web/20120315121757/http://www.perlmol.org/ | |
| ADMET Predictor, MedChem Studio, MedChem Designer | Proprietary, free for qualifying academics | C++, KNIME, Pipeline Pilot | http://www.simulations-plus.com | |
| CDD Vault | Proprietary, free for CDD Public read-only data | CDD Vault | https://www.collaborativedrug.com/cdd-vault | [16] |
| MolecularGraph.jl | MIT License | Julia | https://github.com/mojaie/MolecularGraph.jl | |

References

  1. Jean-Loup Faulon; Andreas Bender (April 2010). Handbook of Chemoinformatics Algorithms. Chapman & Hall. ISBN 978-1420082920. 
  2. Johann Gasteiger (November 2003). Chemoinformatics. Wiley-VCH. ISBN 3527306811. https://archive.org/details/isbn_9783527306817_0. 
  3. Steinbeck C; Han Y; Kuhn S; Horlacher O; Luttmann E; Willighagen E (2003). "The Chemistry Development Kit". J. Chem. Inf. Comput. Sci. 43 (2): 493–500. doi:10.1021/ci025584y. PMID 12653513.
  4. Steinbeck C.; Hoppe C.; Kuhn S.; Floris M.; Guha R.; Willighagen E.L. (2006). "Recent Developments of the Chemistry Development Kit (CDK) - An Open-Source Java Library for Chemo- and Bioinformatics". Curr. Pharm. Des. 12 (17): 2111–2120. doi:10.2174/138161206777585274. PMID 16796559.
  5. Willighagen, Egon L.; Mayfield, John W.; Alvarsson, Jonathan; Berg, Arvid; Carlsson, Lars; Jeliazkova, Nina; Kuhn, Stefan; Pluskal, Tomáš et al. (December 2017). "The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching". Journal of Cheminformatics 9 (1): 33. doi:10.1186/s13321-017-0220-4. PMID 29086040. 
  6. Cao, Y; Charisi, A; Cheng, LC; Jiang, T; Girke, T (2008). "ChemmineR: A Compound Mining Framework for R.". Bioinformatics 24 (15): 1733–1734. doi:10.1093/bioinformatics/btn307. PMID 18596077. 
  7. Wang, Y; Backman, TW; Horan, K; Girke, T (2013). "fmcsR: Mismatch Tolerant Maximum Common Substructure Searching in R.". Bioinformatics 29 (21): 2792–4. doi:10.1093/bioinformatics/btt475. PMID 23962615. 
  8. Requires KNIME (http://www.knime.org/)
  9. Requires KNIME (http://www.knime.org/)
  10. Reads and writes a wide range of chemical file formats.
  11. O’Boyle N; Banck M; James C; Morley C; Vandermeersch T; Hutchison G (2011). "Open Babel: An open chemical toolbox". Journal of Cheminformatics 3 (33): 33. doi:10.1186/1758-2946-3-33. PMID 21982300.
  12. "Rcpi: R/Bioconductor package to generate various descriptors of proteins, compounds and their interactions". Bioinformatics 31 (2): 279–281. Jan 2015. doi:10.1093/bioinformatics/btu624. PMID 25246429. 
  13. "3D-e-Chem-VM: Structural cheminformatics research infrastructure in a freely available Virtual Machine". J. Chem. Inf. Model. 57 (2): 115–121. 2017. doi:10.1021/acs.jcim.6b00686. PMID 28125221. 
  14. "3D-e-Chem: Structural Cheminformatics Workflows for Computer-Aided Drug Discovery". ChemMedChem 13 (6): 614–626. 2018. doi:10.1002/cmdc.201700754. PMID 29337438. 
  15. S. Asad Rahman; M. Bashton; G. L. Holliday; R. Schrader; J. M. Thornton (2009). "Small Molecule Subgraph Detector (SMSD) Toolkit". Journal of Cheminformatics 1 (12): 12. doi:10.1186/1758-2946-1-12. PMID 20298518.
  16. Hohman M; Gregory K; Chibale K; Smith PJ; Ekins S; Bunin B (2009). "Novel web-based tools combining chemistry informatics, biology and social networks for drug discovery". Drug Discov. Today 14 (5–6): 261–270.