Overview
These are my notes on how to count the triples in an RDF store with SPARQL.
The examples use the Japan Search RDF store, which can be queried here:
https://jpsearch.go.jp/rdf/sparql/easy/
Number of Triples
The following query counts the number of triples:
SELECT (COUNT(*) AS ?NumberOfTriples)
WHERE {
  ?s ?p ?o .
}
The result is:
At the time of writing this article (May 6, 2024), there were 1,280,645,565 triples (approximately 1.28 billion).
| NumberOfTriples |
|---|
| 1280645565 |
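To run the same count from a script rather than the web form, a minimal Python sketch could look like the following. It assumes that the store accepts standard SPARQL Protocol GET requests at https://jpsearch.go.jp/rdf/sparql/ (an assumption inferred from the UI address above, not something confirmed here) and that it can return the W3C SPARQL JSON results format. The helper names run_query and first_int are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the machine-readable endpoint sits one level above the /easy/ UI.
ENDPOINT = "https://jpsearch.go.jp/rdf/sparql/"

COUNT_QUERY = "SELECT (COUNT(*) AS ?NumberOfTriples) WHERE { ?s ?p ?o . }"


def run_query(query, endpoint=ENDPOINT):
    """Send a SPARQL query via HTTP GET and return the decoded JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def first_int(results, variable):
    """Read one integer cell from the first row of a SPARQL JSON result."""
    return int(results["results"]["bindings"][0][variable]["value"])
```

With these in place, first_int(run_query(COUNT_QUERY), "NumberOfTriples") would be expected to return the current total, which may differ from the figure above as the store changes.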
How Many Triples Are Connected by a Specific Property
Next, let’s count how many triples use each property. The following query groups all triples by property and sorts the properties by triple count in descending order:
SELECT ?p (COUNT(*) AS ?count)
WHERE {
  ?s ?p ?o .
}
GROUP BY ?p
ORDER BY DESC(?count)
The result is:
We can see that there are 399,447,925 triples (approximately 400 million) connected by schema:description.
| p | count |
|---|---|
| schema:description | 399447925 |
| rdf:type | 84363276 |
| jps:relationType | 72908233 |
| jps:value | 72214780 |
| schema:name | 57377225 |
| schema:provider | 52481873 |
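To put these counts in proportion, the short Python sketch below divides each one by the total from the first query. The numbers are copied from this article. The function name shares is hypothetical.

```python
TOTAL_TRIPLES = 1_280_645_565  # total reported by the first query

# Counts copied from the result table above.
PROPERTY_COUNTS = {
    "schema:description": 399_447_925,
    "rdf:type": 84_363_276,
    "jps:relationType": 72_908_233,
    "jps:value": 72_214_780,
    "schema:name": 57_377_225,
    "schema:provider": 52_481_873,
}


def shares(counts, total):
    """Return each property's fraction of the total number of triples."""
    return {prop: count / total for prop, count in counts.items()}


for prop, share in shares(PROPERTY_COUNTS, TOTAL_TRIPLES).items():
    # Print one line per property, with the share as a percentage.
    print(f"{prop:20s} {share:6.1%}")
```

By this calculation, schema:description alone accounts for roughly 31% of all triples in the store.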
Counting Combinations of Subject and Object Types for a Specific Property
To get an overview of the schema:description triples, let’s count the combinations of subject type and object type that this property links.
SELECT ?subjectType ?objectType (COUNT(*) AS ?count)
WHERE {
  ?subject schema:description ?object .
  ?subject rdf:type ?subjectType .
  OPTIONAL { ?object rdf:type ?objectType . }
}
GROUP BY ?subjectType ?objectType
ORDER BY DESC(?count)
The result is:
There were 87,593,194 triples (approximately 88 million) whose subjects are instances of the type:図書 (Book) class. The objectType column is empty in these rows, presumably because the objects are literals, which have no rdf:type.
| subjectType | objectType | count |
|---|---|---|
| type:図書 | | 87593194 |
| type:動物標本 | | 47068657 |
| type:植物標本 | | 46548944 |
| type:アクセス情報 | | 33291083 |
| type:雑誌 | | 21643930 |
| type:行政文書 | | 11780814 |
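Because the ?objectType pattern is wrapped in OPTIONAL, rows whose object has no rdf:type (plain literals, for example) come back with that variable unbound. In the W3C SPARQL JSON results format, an unbound variable is simply missing from the binding, so client code has to tolerate absent keys. The following Python sketch shows one way to flatten such results into rows. The sample data is hand-written for illustration, with an example.org IRI standing in for the real class IRI. The function name bindings_to_rows is hypothetical.

```python
def bindings_to_rows(results, variables):
    """Flatten SPARQL JSON results into tuples, using "" for unbound variables."""
    rows = []
    for binding in results["results"]["bindings"]:
        rows.append(
            tuple(binding.get(name, {}).get("value", "") for name in variables)
        )
    return rows


# Hand-written sample: ?objectType is unbound, so its key is absent.
SAMPLE = {
    "head": {"vars": ["subjectType", "objectType", "count"]},
    "results": {
        "bindings": [
            {
                "subjectType": {
                    "type": "uri",
                    "value": "http://example.org/type/図書",  # placeholder IRI
                },
                "count": {"type": "literal", "value": "87593194"},
            }
        ]
    },
}

for row in bindings_to_rows(SAMPLE, SAMPLE["head"]["vars"]):
    print(row)
```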
Counting Instances
The following query counts, for each instance of the type:図書 (Book) class, the number of triples that have that instance as the subject and schema:description as the property.
SELECT ?subject (COUNT(*) AS ?count)
WHERE {
  ?subject schema:description ?object .
  ?subject rdf:type type:図書 .
}
GROUP BY ?subject
ORDER BY DESC(?count)
The result is:
We can see that a single instance can have many triples connected by schema:description.
Opening the following resource confirms that it indeed has 251 such triples:
https://jpsearch.go.jp/data/bibnl-20601759
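The same check can also be made with a query instead of opening the page. The Python sketch below builds a query that counts the triples of a single subject, grouped by property, so the schema:description row can be read off the result. It assumes that the URL above is also the resource's IRI in the store. The function name property_counts_query is hypothetical.

```python
def property_counts_query(subject_iri):
    """Build a query counting the triples of one subject, grouped by property."""
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in subject_iri for ch in '<>"{}|^`\\ '):
        raise ValueError("not a safe IRI: " + repr(subject_iri))
    return (
        "SELECT ?p (COUNT(*) AS ?count)\n"
        "WHERE {\n"
        f"  <{subject_iri}> ?p ?o .\n"
        "}\n"
        "GROUP BY ?p\n"
        "ORDER BY DESC(?count)"
    )


print(property_counts_query("https://jpsearch.go.jp/data/bibnl-20601759"))
```

Because the subject is written as a full IRI and the property is left as a variable, the generated query needs no prefix declarations.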
Summary
I hope this serves as a useful reference for analyzing the number of triples in an RDF store.