SPARQL-based Reconciliation

For a detailed example with screenshots, see Reconciling countries against DBpedia.

Reconciliation can be based on standard SPARQL using regular expression comparison. This method is limited and performs poorly on large datasets, because current SPARQL implementations are slow at evaluating regular expression filters.

Because SPARQL results are not ranked, an exact match is not guaranteed to appear in the results when regular expression comparison is combined with a limit on the number of returned results. The SPARQL-based reconciliation service therefore starts with a string comparison against the label (i.e. an exact match) and tries regular expression comparison only if that returns no result.
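
The exact-match-first strategy can be sketched in Python as follows. This is a minimal illustration, not the service's actual implementation: the helper names (`exact_query`, `regex_query`, `reconcile`) are hypothetical, and `run_query` stands in for whatever client sends a query string to the SPARQL endpoint and returns a list of results.

```python
from typing import Callable, List

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Characters escaped so that the input is matched literally by regex().
REGEX_META = set("\\.^$*+?()[]{}|-")


def regex_escape(text: str) -> str:
    """Backslash-escape regular expression metacharacters in text."""
    return "".join("\\" + ch if ch in REGEX_META else ch for ch in text)


def sparql_string(value: str) -> str:
    """Quote value as a single-quoted SPARQL string literal.

    Only backslashes and single quotes are escaped (assumption: the
    input contains no line breaks).
    """
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def exact_query(label: str, limit: int = 3) -> str:
    """Build the exact-match query (hypothetical helper)."""
    return (
        f"SELECT ?entity WHERE {{ ?entity <{RDFS_LABEL}> ?label . "
        f"FILTER (str(?label) = {sparql_string(label)}) "
        f"FILTER isIRI(?entity) }} LIMIT {limit}"
    )


def regex_query(label: str, limit: int = 3) -> str:
    """Build the case-insensitive regex query (hypothetical helper)."""
    pattern = sparql_string(regex_escape(label))
    return (
        f"SELECT ?entity ?label WHERE {{ ?entity <{RDFS_LABEL}> ?label . "
        f"FILTER regex(str(?label), {pattern}, 'i') "
        f"FILTER isIRI(?entity) }} LIMIT {limit}"
    )


def reconcile(
    label: str, run_query: Callable[[str], List[str]], limit: int = 3
) -> List[str]:
    """Try an exact match first; fall back to regex only if nothing is found."""
    results = run_query(exact_query(label, limit))
    if results:
        return results
    return run_query(regex_query(label, limit))
```

Because the regular expression query is issued only when the exact query returns nothing, an exact match can never be pushed out of the result set by the LIMIT clause.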

Type and property autocompletion are also supported, using regular expression comparison against the labels of types and properties (represented using rdfs:label or skos:prefLabel).

Example SPARQL queries

Exact match comparison when the labeling property is rdfs:label
Input: label="Galway"
    
	  SELECT ?entity 
	  WHERE{
		?entity <http://www.w3.org/2000/01/rdf-schema#label> ?label. 
		FILTER (str(?label) = 'Galway'). 
		FILTER isIRI(?entity). 
	  } LIMIT 3
    		
  
Regular expression comparison when the labeling property is rdfs:label
Input: label="Galway"
    
	  SELECT ?entity ?label1
	  WHERE{
		?entity <http://www.w3.org/2000/01/rdf-schema#label> ?label1. 
		FILTER regex(str(?label1),'Galway','i'). 
		FILTER isIRI(?entity). 
	  } LIMIT 3
    		
  
Regular expression comparison when the labeling properties are rdfs:label and dcterms:title
Input: label="Galway"
    
	  SELECT ?entity ?label1 ?label2 
	  WHERE{
		{
		  OPTIONAL{ 
			?entity <http://www.w3.org/2000/01/rdf-schema#label> ?label1. 
			FILTER regex(str(?label1),'Galway','i')
		  }
		  OPTIONAL{ 
			?entity <http://purl.org/dc/terms/title> ?label2. 
			FILTER regex(str(?label2),'Galway','i')
		  }
		  FILTER ( bound(?label1) || bound(?label2))
		}
		FILTER isIRI(?entity). 
	  } LIMIT 3
    		
  
Regular expression comparison when the labeling properties are rdfs:label and dcterms:title, the type is restricted to either dbo:PopulatedPlace or yago:Locations, and a related property value is given
Input: label="Galway", type={"<http://dbpedia.org/ontology/PopulatedPlace>","<http://dbpedia.org/class/yago/Locations>"}, related properties={("<http://dbpedia.org/property/subdivisionName>","Ireland")}
    
	  SELECT ?entity ?label1 ?label2
	  WHERE{
		{
		  OPTIONAL{ 
			?entity <http://www.w3.org/2000/01/rdf-schema#label> ?label1.
			FILTER regex(str(?label1),'Galway','i')
		  }
		  OPTIONAL{ 
			?entity <http://purl.org/dc/terms/title> ?label2. 
			FILTER regex(str(?label2),'Galway','i')
		  }
		  FILTER ( bound(?label1) || bound(?label2))
		}
		{
		  {?entity a <http://dbpedia.org/ontology/PopulatedPlace>. } 
		  UNION 
		  {?entity a <http://dbpedia.org/class/yago/Locations>. } 
		}
		?entity <http://dbpedia.org/property/subdivisionName> 'Ireland'. 
		FILTER isIRI(?entity). 
	  } LIMIT 3
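
The queries above all follow one pattern: an OPTIONAL block per labeling property, a UNION over the allowed types, and one triple per related property. A minimal Python sketch of a builder for that pattern is given here. The function name `build_query` and its parameters are hypothetical, and the label is assumed to contain no regular expression metacharacters, as in the examples.

```python
from typing import Sequence, Tuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def quote(value: str) -> str:
    """Quote value as a single-quoted SPARQL string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_query(
    label: str,
    label_props: Sequence[str],
    types: Sequence[str] = (),
    related: Sequence[Tuple[str, str]] = (),
    limit: int = 3,
) -> str:
    """Assemble a regex reconciliation query (hypothetical helper)."""
    pattern = quote(label)
    optionals, bound = [], []
    # One OPTIONAL block per labeling property, each with its own variable.
    for i, prop in enumerate(label_props, start=1):
        var = f"?label{i}"
        optionals.append(
            f"OPTIONAL {{ ?entity <{prop}> {var} . "
            f"FILTER regex(str({var}), {pattern}, 'i') }}"
        )
        bound.append(f"bound({var})")
    # At least one of the labels must have matched.
    parts = [
        "{ " + " ".join(optionals) + " FILTER (" + " || ".join(bound) + ") }"
    ]
    # Restrict to any one of the given types.
    if types:
        union = " UNION ".join(
            f"{{ ?entity <{RDF_TYPE}> <{t}> . }}" for t in types
        )
        parts.append("{ " + union + " }")
    # Each related property must hold with the given literal value.
    for prop, value in related:
        parts.append(f"?entity <{prop}> {quote(value)} .")
    parts.append("FILTER isIRI(?entity)")
    select_vars = " ".join(f"?label{i}" for i in range(1, len(label_props) + 1))
    return (
        f"SELECT ?entity {select_vars} WHERE {{ "
        + " ".join(parts)
        + f" }} LIMIT {limit}"
    )
```

Calling `build_query` with the label "Galway", the two labeling properties, the two types, and the pair (`http://dbpedia.org/property/subdivisionName`, "Ireland") produces a single-line query with the same structure as the last example.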
    		
  
Type autocompletion using regular expression search
Input: "perso"
    
	  SELECT DISTINCT ?type ?label 
	  WHERE{
		[] a ?type. 
		?type ?p ?label. 
		FILTER (?p=<http://www.w3.org/2000/01/rdf-schema#label> || 
                ?p=<http://www.w3.org/2004/02/skos/core#prefLabel>). 
		FILTER regex(str(?label),'^perso','i')
	  } LIMIT 10
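
The autocompletion query can be generated from the typed prefix as sketched below. This is an illustration under stated assumptions rather than the service's own code: the function name `type_autocomplete_query` is hypothetical, and the prefix is escaped so that characters such as "." are matched literally.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# Characters escaped so that the prefix is matched literally by regex().
REGEX_META = set("\\.^$*+?()[]{}|-")


def type_autocomplete_query(prefix: str, limit: int = 10) -> str:
    """Build a type autocompletion query for a typed prefix (hypothetical helper)."""
    # Escape regex metacharacters, then anchor the pattern at the label start.
    escaped = "".join("\\" + ch if ch in REGEX_META else ch for ch in prefix)
    pattern = "^" + escaped
    # Quote as a SPARQL string literal: backslashes first, then single quotes.
    literal = "'" + pattern.replace("\\", "\\\\").replace("'", "\\'") + "'"
    return (
        "SELECT DISTINCT ?type ?label WHERE { "
        "[] a ?type . ?type ?p ?label . "
        f"FILTER (?p = <{RDFS_LABEL}> || ?p = <{SKOS_PREF_LABEL}>) "
        f"FILTER regex(str(?label), {literal}, 'i') "
        f"}} LIMIT {limit}"
    )


print(type_autocomplete_query("perso"))
```

The `^` anchor restricts matches to labels that begin with the typed text, and the `'i'` flag makes the comparison case-insensitive, as in the query above.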