Motivation: Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. In recent years, feature learning methods applicable to graph-structured data have become available, but they have not yet been widely applied to or evaluated on structured biological knowledge. Results: We develop a novel method for feature learning on biological knowledge graphs. Our method combines symbolic methods, in particular knowledge representation using symbolic logic and automated reasoning, with neural networks to generate embeddings of nodes that encode related information within knowledge graphs. Through the use of symbolic logic, these embeddings contain both explicit and implicit information. We apply these embeddings to the prediction of edges in the knowledge graph representing problems of function prediction, finding candidate genes of diseases, protein-protein interactions, or drug target relations, and demonstrate performance that matches and sometimes outperforms traditional approaches based on manually crafted features. Our method can be applied to any biological knowledge graph, and will thereby open up the increasing number of Semantic Web based knowledge bases in biology to use in machine learning and data analytics. Availability and implementation: https://github.com/bio-ontology-research-group/walking-rdf-and-owl. Contact: robert.hoehndorf@kaust.edu.sa. Supplementary information: Supplementary data are available at Bioinformatics online.
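As a rough illustration of how node embeddings can be used for edge prediction, the sketch below ranks candidate target nodes by cosine similarity to a query node. The abstract does not say how edges are scored, so cosine similarity is an assumption made here for illustration (a trained classifier over pairs of embeddings is another common choice), and the node names, vectors and the helper `rank_candidates` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query, candidates, embeddings):
    """Return (candidate, score) pairs sorted from most to least similar.

    Assumption: a higher cosine similarity between two node embeddings
    is treated as a higher likelihood that an edge links the two nodes.
    """
    scored = [(c, cosine(embeddings[query], embeddings[c])) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real ones would be learned from the graph.
toy_embeddings = {
    "gene_A": [1.0, 0.0],
    "disease_X": [0.9, 0.1],
    "disease_Y": [0.0, 1.0],
}

ranking = rank_candidates("gene_A", ["disease_X", "disease_Y"], toy_embeddings)
```

In this toy setting `disease_X` ranks above `disease_Y` for `gene_A` because its vector points in nearly the same direction.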
The Protein Data Bank Japan (PDBj, http://pdbj.org), a member of the worldwide Protein Data Bank (wwPDB), accepts and processes the deposited data of experimentally determined macromolecular structures. While maintaining the archive in collaboration with other wwPDB partners, PDBj also provides a wide range of services and tools for analyzing structures and functions of proteins. We herein outline the updated web user interfaces together with the RESTful web services and the backend relational database that support these interfaces. To enhance the interoperability of the PDB data, we have previously developed PDB/RDF, PDB data in the Resource Description Framework (RDF) format, which is now a wwPDB standard called wwPDB/RDF. We have enhanced the connectivity of the wwPDB/RDF data by incorporating various external data resources. Services for searching, comparing and analyzing the ever-increasing large structures determined by hybrid methods are also described.
Life science research now relies heavily on a wide range of databases covering genome sequences, transcription, protein three-dimensional (3D) structures, protein–protein interactions, phenotypes and so forth. The knowledge accumulated by omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combined search across these databases can yield new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that helps life science researchers retrieve expert knowledge stored in the databases and build new hypotheses about their research targets. This web application, named VaProS, emphasizes the interconnection between the functional information of genome sequences and protein 3D structures, such as the structural effects of gene mutations. In this manuscript, we present the concept of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of the search, exemplified by an investigation of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.
We have developed a new platform-independent web-based molecular viewer using JavaScript and WebGL. The molecular viewer, Molmil, has been integrated into several services offered by Protein Data Bank Japan and can be easily extended with new functionality by third-party developers. Furthermore, the viewer can load files in various formats from the user’s local hard drive without uploading the data to a server. Molmil is available for all platforms supporting WebGL (e.g. Windows, Linux, iOS, Android) from http://gjbekker.github.io/molmil/. The source code is available at http://github.com/gjbekker/molmil under the LGPLv3 licence.
Because the output of solar PV systems varies significantly with technology, design and prevailing weather conditions, evaluating them under actual field conditions is important for identifying their real performance characteristics. In this paper, the comparative performance of six different PV systems connected to a 1.2 MWp grid-integrated solar farm is presented. The solar technologies considered are single-crystalline silicon (sc-Si), polycrystalline silicon (mc-Si), microcrystalline silicon (nc-Si/a-Si), amorphous silicon (a-Si), copper indium selenide (CIS) and Heterojunction with Intrinsic Thin Layer (HIT). Following the IEA guidelines, the array yield, capture losses, array efficiency ratio, and performance ratio were taken as the criteria for the performance comparison. Among the six module types, the systems based on amorphous silicon and HIT offer the best performance under the tropical environment considered.
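For readers unfamiliar with these criteria, the sketch below computes the normalized yields, losses and performance ratio using the definitions commonly given in IEC 61724 style monitoring guidelines. The abstract does not give the paper's formulas or data, so the function name, parameter names and example numbers are hypothetical, and the "array efficiency ratio" is left out because its exact definition is not stated.

```python
def pv_performance(e_dc_kwh, e_ac_kwh, irradiation_kwh_m2, p_rated_kwp,
                   g_stc_kw_m2=1.0):
    """Normalized PV performance indices for one reporting period.

    Assumed definitions (standard normalized form, not taken from the paper):
      reference yield   Y_R = H / G_STC       (in-plane irradiation / 1 kW/m^2)
      array yield       Y_A = E_DC / P_0      (DC energy per rated kWp)
      final yield       Y_F = E_AC / P_0      (AC energy per rated kWp)
      capture losses    L_C = Y_R - Y_A
      system losses     L_S = Y_A - Y_F
      performance ratio PR  = Y_F / Y_R
    """
    if p_rated_kwp <= 0 or irradiation_kwh_m2 <= 0 or g_stc_kw_m2 <= 0:
        raise ValueError("rated power and irradiation values must be positive")
    y_r = irradiation_kwh_m2 / g_stc_kw_m2
    y_a = e_dc_kwh / p_rated_kwp
    y_f = e_ac_kwh / p_rated_kwp
    return {
        "reference_yield": y_r,
        "array_yield": y_a,
        "final_yield": y_f,
        "capture_losses": y_r - y_a,
        "system_losses": y_a - y_f,
        "performance_ratio": y_f / y_r,
    }


# Hypothetical daily values for a 1 kWp array, chosen only for illustration.
example = pv_performance(e_dc_kwh=5.0, e_ac_kwh=4.5,
                         irradiation_kwh_m2=6.0, p_rated_kwp=1.0)
```

Because every index is normalized by rated power or by reference irradiance, systems of different sizes and technologies can be compared directly on these values.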
The Protein Data Bank Japan (PDBj, http://pdbj.org) is a member of the worldwide Protein Data Bank (wwPDB) and accepts and processes the deposited data of experimentally determined macromolecular structures. While maintaining the archive in collaboration with other wwPDB partners, PDBj also provides a wide range of services and tools for analyzing structures and functions of proteins, which are summarized in this article. To enhance the interoperability of the PDB data, we have recently developed PDB/RDF, PDB data in the Resource Description Framework (RDF) format, along with its ontology in the Web Ontology Language (OWL) based on the PDB mmCIF Exchange Dictionary. Being in the standard format for the Semantic Web, the PDB/RDF data provide a means to integrate the PDB with other biological information resources.