Difference between revisions of "Small SPARQL, RDQL, etc. Cheat Sheet"

Revision as of 04:18, 30 August 2012

For lack of an existing one, I'll put some of my notes on SPARQL, RDQL, graph databases, and semantic web topics in general here. It will probably branch out into several pages in the future, but for now it's just a small mess.

Introduction

I'm using Redland 1.0.13, Raptor 2.0.4, and Rasqal 0.9.26 as my reference implementation of SPARQL 1.0, SPARQL 1.1, and RDQL.

Basic Observations

The main rule of thumb I have observed in many systems: try to guess which statement of the WHERE clause restricts the set of triples the most, and order the statements by increasing generality (most restrictive first).

For example, let's find all items that "user X" bought that are blue. Let's assume that there are many more blue items in the DB than items that "user X" bought.

Then the query (get all things that are blue, that "user X" bought):

SELECT ?thing WHERE { ?thing _:color "blue" . "user X" _:bought ?thing }

will typically run (much) slower than this one (get all things that "user X" bought, that are blue):

SELECT ?thing WHERE { "user X" _:bought ?thing . ?thing _:color "blue" }

Note that the result set is identical, but the former query first takes all the blue things and picks those bought by "user X", while the latter takes the small set of bought items and picks just the blue ones.
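
To make the difference concrete, here is a rough Python sketch. It assumes a naive engine that evaluates the patterns strictly left to right with a nested-loop join; real engines (Rasqal included) may well do something smarter. The toy store, the data (1000 blue items, 3 items bought by "user X") and the helper names are all made up for illustration.

```python
# Toy triple store and a naive left-to-right evaluator.
# Hypothetical sketch: this is NOT how Redland/Rasqal work internally.

def is_var(term):
    """Variables are strings starting with '?', as in SPARQL."""
    return term.startswith("?")

def match(pattern, triple, binding):
    """Return `binding` extended so that `pattern` matches `triple`, or None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against its earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def evaluate(triples, patterns):
    """Evaluate patterns in the given order.

    Returns (solutions, work), where `work` is the total number of
    intermediate bindings produced across all patterns.
    """
    bindings = [{}]
    work = 0
    for pattern in patterns:
        next_bindings = []
        for b in bindings:
            for t in triples:
                nb = match(pattern, t, b)
                if nb is not None:
                    next_bindings.append(nb)
        work += len(next_bindings)
        bindings = next_bindings
    return bindings, work

# Made-up data: many blue items, few items bought by "user X".
triples = [(f"item{i}", "color", "blue") for i in range(1000)]
triples.append(("item5000", "color", "red"))
triples += [
    ("user X", "bought", "item1"),
    ("user X", "bought", "item2"),
    ("user X", "bought", "item5000"),
]

blue_first = [("?thing", "color", "blue"), ("user X", "bought", "?thing")]
bought_first = [("user X", "bought", "?thing"), ("?thing", "color", "blue")]

slow_result, slow_work = evaluate(triples, blue_first)
fast_result, fast_work = evaluate(triples, bought_first)

print(sorted(b["?thing"] for b in slow_result))  # → ['item1', 'item2']
print(sorted(b["?thing"] for b in fast_result))  # → ['item1', 'item2']
print(slow_work, fast_work)                      # → 1002 5
```

Both orderings return the same two items, but the "blue first" ordering drags 1000 intermediate bindings through the second pattern, while the "bought first" ordering only ever carries 3.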

In general - graph databases are incredibly powerful tools, but it's up to you to make them smart!

@@ Line 7: / Line 7: @@
 == Basic Observations ==
-Main rule of thumb I observed in many systems - fixing object ("RHS") of the triplet is much cheaper than fixing subject ("LHS"), sometimes in the order of 3-4 magnitudes.
+Main rule of thumb I observed in many systems - try to guess what statement of the ''WHERE'' clause restricts the triplets set the most, and order the statements in increasing order of generality (most restrictive first).
-In other words - having only triplets
+For example, lets find all items that "user X" bought, that are blue. Lets presume that there are many more blue items in the DB than items that "user X" bought.
- ''something'' contains ''a-thing''
-and querying for "all things that given fixed ''something'' contains"
+Then the query (get all things that are blue, that "user X" bought):
-  SELECT ?thing WHERE { ''something'' contains ?thing }
+  SELECT ?thing WHERE { ?thing _:color "blue" . "user X" _:bought ?thing }
-might take substantially longer than storing also
+will typically run (much) slower, than  (get all things that "user X" bought, that are blue):
-  ''a-thing'' contained-in ''something''
+  SELECT ?thing WHERE { "user X" _:bought ?thing . ?thing _:color "blue" }
-triplets along, and querying for "all things contained in given fixed ''something''"
-  SELECT ?thing WHERE { ?thing contained-in ''something'' }
+Note that the result set is identical, but the former query first takes all the blue things and picks those bought by "user X", while the latter takes the small set of bought items and picks just the blue ones.
+In general - '''graph databases are incredibly powerful tools, but it's up to you to make them smart!'''
