Drug and Vaccine Discovery - Knowledge Graph and Apache Spark
by Pepe García
May 28, 2021 • events, talks
Recently, a new open-source library sponsored by GSK and co-developed by 47 Degrees was featured at the Data + AI Summit 2021.
Bellman is designed to run SPARQL queries on Apache Spark Datasets. You can now watch the talk on demand for free and view the slides below:
Data+AI Summit 2021
Drug and Vaccine Discovery: Knowledge Graph + Apache Spark
RDF, knowledge graphs, and ontologies enable companies to produce and consume graph data that is interoperable, shareable, and self-describing. GSK has set out to build the world’s largest medical knowledge graph to give our scientists access to the world’s medical knowledge and to enable machine learning to infer links between facts.
These inferred links are at the heart of gene-to-disease mapping and are the future of discovering new treatments and vaccines. To power RDF sub-graphing, GSK has developed a set of open-source libraries codenamed “Project Bellman” that enable SPARQL queries over partitioned RDF data in Apache Spark.
These tools make it possible to scale SPARQL querying to trillions of RDF triples, run point-in-time queries, and deliver incremental data updates to downstream consumer applications. They are used both by GSK’s AI/ML team to discover gene-to-disease mappings and by GSK’s scientists to query the world’s medical knowledge.
This talk was given by GlaxoSmithKline director John Hunter at the Data+AI Summit 2021.
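For readers new to SPARQL over RDF triples, here is a minimal sketch of the core idea: a basic graph pattern is answered by matching each triple pattern against the data and joining the matches on shared variables. This is plain Python for illustration only. It is not Bellman's API and does not use Spark, and every identifier in it (the `ex:` names, `match_pattern`, `evaluate_bgp`) is a made-up placeholder rather than real data or a real library function.

```python
# Toy illustration only: plain Python, not Bellman's API and not Spark.
# All identifiers (ex:gene1, ex:diseaseA, ...) are hypothetical placeholders.

# RDF data as (subject, predicate, object) triples.
triples = [
    ("ex:gene1", "ex:associatedWith", "ex:diseaseA"),
    ("ex:gene2", "ex:associatedWith", "ex:diseaseB"),
    ("ex:gene3", "ex:associatedWith", "ex:thingC"),
    ("ex:diseaseA", "rdf:type", "ex:Disease"),
    ("ex:diseaseB", "rdf:type", "ex:Disease"),
]


def is_var(term):
    """SPARQL variables are written with a leading question mark."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            # Bind the variable, or reject if it is already bound to another value.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding


def evaluate_bgp(patterns, triples):
    """Evaluate a basic graph pattern by joining its triple patterns in order."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# Equivalent to:
#   SELECT ?gene ?disease WHERE {
#     ?gene ex:associatedWith ?disease .
#     ?disease rdf:type ex:Disease .
#   }
query = [
    ("?gene", "ex:associatedWith", "?disease"),
    ("?disease", "rdf:type", "ex:Disease"),
]

results = sorted((b["?gene"], b["?disease"]) for b in evaluate_bgp(query, triples))
print(results)
```

The nested loop here is a naive join over an in-memory list. An engine that runs on Spark would presumably express the same joins over distributed Datasets, which is what makes querying at a much larger scale feasible.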