Linking Pathways to Biosynthetic Gene Clusters

Tutorial on SPARQL for PlantMetWiki


Plant specialized metabolites are often produced by biosynthetic gene clusters (BGCs) — groups of physically co-located genes that together encode a metabolic pathway.

PlantMetWiki provides explicit cross-links between metabolic pathways and BGC resources, allowing you to move from:

  • pathway-level knowledge
  • to genomic context
  • to specialized metabolite biosynthesis

In this section, we explore how PlantMetWiki connects pathways to:

  • plantiSMASH predictions
  • MIBiG curated BGCs
  • external metadata describing gene clusters

SPARQL endpoint
https://plantmetwiki.bioinformatics.nl/sparql

Graph used in all queries

FROM <http://plantmetwiki.bioinformatics.nl/>
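
The endpoint and graph above can also be queried from a script. The following is a minimal Python sketch, assuming the endpoint follows the standard SPARQL 1.1 Protocol (a `query` parameter on a GET request) and can return `application/sparql-results+json`; this tutorial does not confirm either point. The names `build_request` and `run_select` are illustrative, not part of PlantMetWiki.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://plantmetwiki.bioinformatics.nl/sparql"
GRAPH = "http://plantmetwiki.bioinformatics.nl/"


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that carries the query as a URL parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Assumption: the server honours this Accept header and answers in JSON.
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_select(query, endpoint=ENDPOINT):
    """Send a SELECT query and return the list of result bindings."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as resp:
        payload = json.load(resp)
    # SPARQL JSON results keep one dict per row under results.bindings.
    return payload["results"]["bindings"]
```

If those assumptions hold, `run_select(query)` would return one dictionary per result row, with each variable available as, for example, `row["pathway"]["value"]`.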

A BGC cross-link connects:

•	a PlantMetWiki pathway
•	to a gene cluster identifier
•	originating from an external resource

These links are derived from:

•	pathway annotations
•	genomic metadata
•	curated and predicted BGC databases

PlantMetWiki does not duplicate BGC data; instead, it acts as a hub connecting pathways to specialized genomics resources.

To get an overview of the pathway–BGC links that exist, you can list the triples whose object IRI contains "BGC" (the query returns at most 200 rows):

SELECT ?pathway ?bgc
FROM <http://plantmetwiki.bioinformatics.nl/>
WHERE {
  ?pathway ?predicate ?bgc .
  FILTER(CONTAINS(STR(?bgc), "BGC"))
}
LIMIT 200

This query reveals that:

  • pathways may link to multiple BGCs
  • BGC identifiers come from different external sources
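
To check which pathways link to more than one BGC, the result rows can be grouped by pathway. This is a minimal Python sketch that works on the `results.bindings` list of a SPARQL JSON result; the helper names and the `example.org` IRIs are placeholders, not real PlantMetWiki data.

```python
from collections import defaultdict


def bgcs_per_pathway(bindings):
    """Map each pathway IRI to the sorted, de-duplicated BGC IRIs linked to it."""
    grouped = defaultdict(set)
    for row in bindings:
        grouped[row["pathway"]["value"]].add(row["bgc"]["value"])
    return {pw: sorted(bgcs) for pw, bgcs in grouped.items()}


def row(pathway, bgc):
    """Build one binding in the shape used by SPARQL JSON results."""
    return {
        "pathway": {"type": "uri", "value": pathway},
        "bgc": {"type": "uri", "value": bgc},
    }


# Hypothetical rows: pathway A has two distinct BGCs, pathway B has one.
sample = [
    row("http://example.org/pathway/A", "http://example.org/bgc/BGC-1"),
    row("http://example.org/pathway/A", "http://example.org/bgc/BGC-2"),
    row("http://example.org/pathway/B", "http://example.org/bgc/BGC-1"),
    row("http://example.org/pathway/A", "http://example.org/bgc/BGC-1"),
]

links = bgcs_per_pathway(sample)
# Pathways that link to more than one BGC.
multi = sorted(pw for pw, bgcs in links.items() if len(bgcs) > 1)
print(multi)
```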

Linking pathways to plantiSMASH BGCs

plantiSMASH predicts biosynthetic gene clusters directly from plant genomes.

PlantMetWiki pathways can link to plantiSMASH BGC identifiers, allowing you to:

  • move from pathway → genome
  • inspect candidate gene clusters
  • evaluate biosynthetic hypotheses

Example query (from plantiSMASHLinks.rq):

SELECT ?pathway ?plantiSMASH_BGC
FROM <http://plantmetwiki.bioinformatics.nl/>
WHERE {
  ?pathway ?p ?plantiSMASH_BGC .
  FILTER(CONTAINS(STR(?plantiSMASH_BGC), "plantiSMASH"))
}
LIMIT 200

Each returned BGC identifier can be clicked through to explore:

  • predicted cluster boundaries
  • gene annotations
  • domain architecture

Linking pathways to curated MIBiG clusters

MIBiG is a manually curated database of experimentally validated biosynthetic gene clusters.

PlantMetWiki links pathways to MIBiG entries when:

  • a pathway is supported by experimental evidence
  • a known BGC has been described in the literature

Example query (from MIBiGLinks.rq):

SELECT ?pathway ?mibig
FROM <http://plantmetwiki.bioinformatics.nl/>
WHERE {
  ?pathway ?p ?mibig .
  FILTER(CONTAINS(STR(?mibig), "mibig"))
}
LIMIT 200

These links allow you to:

  • trace pathways to experimentally validated gene clusters
  • connect pathway knowledge with publications
  • assess confidence in biosynthetic assignments
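
The plantiSMASH and MIBiG queries differ only in the substring passed to CONTAINS. A small helper can generate either one; the sketch below is illustrative (the function name `bgc_links_query` is hypothetical) and only builds the query text, it does not contact the endpoint.

```python
import re

GRAPH = "http://plantmetwiki.bioinformatics.nl/"


def bgc_links_query(token, limit=200):
    """Build a pathway-to-BGC query that keeps object IRIs containing `token`."""
    # Allow only letters and digits so the token cannot break out of the
    # quoted SPARQL string literal.
    if not re.fullmatch(r"[A-Za-z0-9]+", token):
        raise ValueError(f"unsupported token: {token!r}")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT ?pathway ?bgc\n"
        f"FROM <{GRAPH}>\n"
        "WHERE {\n"
        "  ?pathway ?p ?bgc .\n"
        f'  FILTER(CONTAINS(STR(?bgc), "{token}"))\n'
        "}\n"
        f"LIMIT {limit}"
    )


print(bgc_links_query("mibig"))
```

Note that CONTAINS is case-sensitive, so `"mibig"` and `"MIBiG"` are different tokens.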

Combining multiple BGC sources

Some pathways link to both predicted and curated clusters.

You can retrieve all BGC-related links regardless of source:

SELECT ?pathway ?bgc
FROM <http://plantmetwiki.bioinformatics.nl/>
WHERE {
  ?pathway ?p ?bgc .
  FILTER(
    CONTAINS(STR(?bgc), "plantiSMASH") ||
    CONTAINS(STR(?bgc), "mibig")
  )
}
LIMIT 200

This makes it possible to:

  • compare predictions with curated knowledge
  • identify gaps in experimental validation
  • prioritize clusters for follow-up study
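
One way to look for gaps in experimental validation is to find pathways that have a predicted cluster but no curated one. The Python sketch below classifies each BGC IRI with the same substring tests as the FILTER above and works on the `results.bindings` list of a SPARQL JSON result. The helper names, the pathway IRIs and the plantiSMASH IRI are hypothetical; only the MIBiG IRI form is taken from this tutorial.

```python
def source_of(bgc_iri):
    """Classify a BGC IRI using the same substrings as the combined FILTER."""
    if "plantiSMASH" in bgc_iri:
        return "plantiSMASH"
    if "mibig" in bgc_iri:
        return "MIBiG"
    return "other"


def predicted_only(bindings):
    """Return pathways linked to plantiSMASH clusters but to no MIBiG entry."""
    sources = {}
    for row in bindings:
        pathway = row["pathway"]["value"]
        sources.setdefault(pathway, set()).add(source_of(row["bgc"]["value"]))
    return sorted(
        pw for pw, found in sources.items()
        if "plantiSMASH" in found and "MIBiG" not in found
    )


def row(pathway, bgc):
    """Build one binding in the shape used by SPARQL JSON results."""
    return {"pathway": {"value": pathway}, "bgc": {"value": bgc}}


# Hypothetical rows: pathway A has both sources, pathway B only a prediction.
sample = [
    row("http://example.org/pathway/A", "http://example.org/plantiSMASH/cluster-1"),
    row("http://example.org/pathway/A", "https://bioregistry.io/mibig:BGC0000670"),
    row("http://example.org/pathway/B", "http://example.org/plantiSMASH/cluster-2"),
]

print(predicted_only(sample))
```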

BGC-centric view: pathways for a specific BGC

You can also start from a specific MIBiG BGC and ask which pathway belongs to that BGC:

PREFIX ro:      <http://purl.obolibrary.org/obo/RO_>
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>

# Retrieve the thalianol pathway, starting from MIBiG cluster BGC0000670
SELECT DISTINCT ?pw (STR(?titleLit) AS ?title)
FROM <http://plantmetwiki.bioinformatics.nl/>
WHERE {
  # Genes attached to the BGC through ro:0000051
  <https://bioregistry.io/mibig:BGC0000670> ro:0000051 ?gene .

  # Interactions in which a gene participates, and the pathway they are part of
  ?interaction wp:participants ?gene ;
               dcterms:isPartOf ?pw .

  # Pathway title; STR() in the SELECT drops any language tag or datatype
  ?pw dc:title ?titleLit .
}
ORDER BY ?title
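
To reuse this lookup for other clusters, the MIBiG accession can be turned into a parameter. The Python sketch below only builds the query text. It assumes accessions have the form `BGC` followed by seven digits, as in BGC0000670, and that other clusters use the same `https://bioregistry.io/mibig:` IRI pattern; the function name is hypothetical.

```python
import re

GRAPH = "http://plantmetwiki.bioinformatics.nl/"
# Assumption: MIBiG accessions look like BGC0000670 (BGC + 7 digits).
MIBIG_ACCESSION = re.compile(r"BGC\d{7}")


def pathways_for_bgc_query(accession):
    """Return the BGC-to-pathway query for one MIBiG accession."""
    # Validate before interpolating, so arbitrary text never reaches the query.
    if not MIBIG_ACCESSION.fullmatch(accession):
        raise ValueError(f"not a MIBiG accession: {accession!r}")
    return f"""
PREFIX ro:      <http://purl.obolibrary.org/obo/RO_>
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?pw (STR(?titleLit) AS ?title)
FROM <{GRAPH}>
WHERE {{
  <https://bioregistry.io/mibig:{accession}> ro:0000051 ?gene .
  ?interaction wp:participants ?gene ;
               dcterms:isPartOf ?pw .
  ?pw dc:title ?titleLit .
}}
ORDER BY ?title
""".strip()


print(pathways_for_bgc_query("BGC0000670"))
```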

By linking pathways to gene clusters, PlantMetWiki enables:

  • genome-to-metabolite reasoning
  • discovery of candidate biosynthetic loci
  • comparison of predicted vs curated clusters
  • integration with omics pipelines

This makes PlantMetWiki especially useful for:

  • plant specialized metabolism research
  • natural product discovery
  • functional genomics
  • comparative pathway analysis