Overview

Here are my notes on how to count the number of triples in an RDF store.

The examples use the Japan Search RDF store, queried through the following SPARQL interface:

https://jpsearch.go.jp/rdf/sparql/easy/

Number of Triples

The following query counts the number of triples:

SELECT (COUNT(*) AS ?NumberOfTriples)
WHERE {
  ?s ?p ?o .
}

The result is:

https://jpsearch.go.jp/rdf/sparql/easy/?query=SELECT+(COUNT(*)+AS+%3FNumberOfTriples)%0AWHERE+{%0A++%3Fs+%3Fp+%3Fo+.%0A}

At the time of writing this article (May 6, 2024), there were 1,280,645,565 triples (approximately 1.28 billion).

NumberOfTriples

1280645565
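The result link above is the query text percent-encoded into a `query` parameter. As a minimal sketch of how such a link can be built, the following uses Python's standard `urllib.parse`; the helper name `build_query_url` is hypothetical, and the assumption that the page reads a `query` parameter is taken only from the links shown in this article.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Page URL taken from the article. That it accepts a `query` parameter is an
# assumption based on the result links shown in the article.
EASY_UI = "https://jpsearch.go.jp/rdf/sparql/easy/"

COUNT_ALL = """SELECT (COUNT(*) AS ?NumberOfTriples)
WHERE {
  ?s ?p ?o .
}"""


def build_query_url(base: str, query: str) -> str:
    """Return base + '?query=...' with the SPARQL text percent-encoded."""
    # urlencode turns spaces into '+' and newlines into '%0A'.
    return base + "?" + urlencode({"query": query})


url = build_query_url(EASY_UI, COUNT_ALL)
print(url)

# Decoding the parameter again gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

Note that `urlencode` also escapes characters such as parentheses and braces, so the generated link looks slightly different from the one above while decoding to the same query.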

How Many Triples Are Connected by a Specific Property

Next, let's count how many triples use each property, so that the count for any specific property can be read off the result. Here is an example query:

SELECT ?p (COUNT(*) AS ?count)
WHERE {
  ?s ?p ?o .
}
GROUP BY ?p
ORDER BY DESC(?count)

The result is:

https://jpsearch.go.jp/rdf/sparql/easy/?query=SELECT+%3Fp+(COUNT(*)+AS+%3Fcount)%0AWHERE+{%0A++%3Fs+%3Fp+%3Fo+.%0A}%0AGROUP+BY+%3Fp%0AORDER+BY+DESC(%3Fcount)

The most frequent property is schema:description, which connects 399,447,925 triples (approximately 400 million).

p                       count
schema:description  399447925
rdf:type             84363276
jps:relationType     72908233
jps:value            72214780
schema:name          57377225
schema:provider      52481873
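To put these counts in proportion, each one can be divided by the total number of triples reported earlier. The following is a small sketch of that arithmetic using only the figures quoted in this article; the function name `share` is hypothetical.

```python
# Figures copied from the article (as of May 6, 2024).
TOTAL_TRIPLES = 1_280_645_565
PROPERTY_COUNTS = {
    "schema:description": 399_447_925,
    "rdf:type": 84_363_276,
    "jps:relationType": 72_908_233,
    "jps:value": 72_214_780,
    "schema:name": 57_377_225,
    "schema:provider": 52_481_873,
}


def share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


for prop, count in PROPERTY_COUNTS.items():
    print(f"{prop:<20} {count:>13,} {share(count, TOTAL_TRIPLES):5.1f}%")
```

By this calculation, schema:description alone accounts for about 31% of all triples in the store.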

Counting Combinations of Subject and Object Types for a Specific Property

To get an overview of these triples, let's count the combinations of subject type and object type linked by the schema:description property.

SELECT ?subjectType ?objectType (COUNT(*) AS ?count)
WHERE {
  ?subject schema:description ?object .
  ?subject rdf:type ?subjectType .
  optional {?object rdf:type ?objectType . }
}
GROUP BY ?subjectType ?objectType
ORDER BY DESC(?count)

The result is:

https://jpsearch.go.jp/rdf/sparql/easy/?query=SELECT+%3FsubjectType+%3FobjectType+(COUNT(*)+AS+%3Fcount)%0AWHERE+{%0A++%3Fsubject+schema%3Adescription+%3Fobject+.%0A++%3Fsubject+rdf%3Atype+%3FsubjectType+.+%0A++optional+{%3Fobject+rdf%3Atype+%3FobjectType+.+}%0A}%0AGROUP+BY+%3FsubjectType+%3FobjectType%0AORDER+BY+DESC(%3Fcount)

Approximately 87.6 million triples have subjects that are instances of the type:図書 (Book) class.

subjectType        objectType     count
type:図書          (unbound)   87593194
type:動物標本      (unbound)   47068657
type:植物標本      (unbound)   46548944
type:アクセス情報  (unbound)   33291083
type:雑誌          (unbound)   21643930
type:行政文書      (unbound)   11780814
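The objectType column has no value in these rows, which is what OPTIONAL yields when the object has no rdf:type (presumably because the descriptions are literals). In the SPARQL 1.1 JSON results format, an unbound variable is simply left out of the binding object, so client code has to allow for the missing key. The sketch below illustrates this with a hand-written sample; the full IRI is an assumed expansion of type:図書, and the helper names are hypothetical.

```python
from typing import Optional

# Hand-written sample in the SPARQL 1.1 JSON results layout. The IRI is an
# assumed expansion of type:図書; the count is the figure from the article.
SAMPLE = {
    "head": {"vars": ["subjectType", "objectType", "count"]},
    "results": {
        "bindings": [
            {
                "subjectType": {
                    "type": "uri",
                    "value": "https://jpsearch.go.jp/term/type/図書",
                },
                # No "objectType" key: the OPTIONAL pattern did not match.
                "count": {
                    "type": "literal",
                    "datatype": "http://www.w3.org/2001/XMLSchema#integer",
                    "value": "87593194",
                },
            }
        ]
    },
}


def value_of(binding: dict, var: str) -> Optional[str]:
    """Return the variable's value, or None when it is unbound."""
    term = binding.get(var)
    return None if term is None else term["value"]


def to_rows(results: dict) -> list:
    """Turn the bindings into tuples ordered like head.vars."""
    variables = results["head"]["vars"]
    return [
        tuple(value_of(b, v) for v in variables)
        for b in results["results"]["bindings"]
    ]


rows = to_rows(SAMPLE)
for subject_type, object_type, count in rows:
    print(subject_type, object_type or "(unbound)", int(count))
```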

Counting Instances

The following query counts, for each instance of the type:図書 (Book) class, the number of triples that have it as the subject and schema:description as the property.

SELECT ?subject (COUNT(*) AS ?count)
WHERE {
  ?subject schema:description ?object .
  ?subject rdf:type type:図書 .
}
GROUP BY ?subject
ORDER BY desc(?count)

The result is:

https://jpsearch.go.jp/rdf/sparql/easy/?query=SELECT+%3Fsubject+(COUNT(*)+AS+%3Fcount)%0AWHERE+{%0A++%3Fsubject+schema%3Adescription+%3Fobject+.%0A++%3Fsubject+rdf%3Atype+type%3A図書+.%0A}%0AGROUP+BY+%3Fsubject%0AORDER+BY+desc(%3Fcount)

The results show that a single instance can have many triples connected by schema:description.

Accessing the following resource confirms that it does have 251 triples.

https://jpsearch.go.jp/data/bibnl-20601759
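Because the per-instance query groups by every book, it can return a very large result, so when running it from a script it may help to add a LIMIT and ask only for the top rows. The following is a rough sketch using Python's standard library. The endpoint URL and the three PREFIX IRIs are assumptions that do not appear in this article and should be checked against the Japan Search documentation; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed machine-readable endpoint (not stated in the article).
ENDPOINT = "https://jpsearch.go.jp/rdf/sparql/"

# Same query as in the article, with assumed PREFIX IRIs and a LIMIT added.
QUERY = """PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <https://schema.org/>
PREFIX type: <https://jpsearch.go.jp/term/type/>
SELECT ?subject (COUNT(*) AS ?count)
WHERE {
  ?subject schema:description ?object .
  ?subject rdf:type type:図書 .
}
GROUP BY ?subject
ORDER BY DESC(?count)
LIMIT 10"""


def build_request(endpoint: str, query: str) -> Request:
    """Prepare a GET request that asks for SPARQL JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def fetch_top_subjects(endpoint: str = ENDPOINT, query: str = QUERY) -> list:
    """Send the query and return (subject IRI, count) pairs."""
    with urlopen(build_request(endpoint, query), timeout=60) as resp:
        data = json.load(resp)
    return [
        (b["subject"]["value"], int(b["count"]["value"]))
        for b in data["results"]["bindings"]
    ]


# Building the request does not touch the network; fetch_top_subjects() does.
request = build_request(ENDPOINT, QUERY)
print(request.full_url)
```

Calling `fetch_top_subjects()` would then return the ten instances with the most schema:description triples, assuming the endpoint and prefixes above are correct.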

Summary

I hope this serves as a useful reference for analyzing the number of triples in an RDF store.