SKOS Guide for Information Professionals: Appendix

A Guide to Representing Structured Controlled Vocabularies in the Simple Knowledge Organization System

Priscilla Jane Frazier

March 2015


Appendix: Foundational Standards and Context

SKOS is a data-sharing standard and was built upon several preexisting Semantic Web standards for formal logic and structure. These technologies provide ways of expressing meaning that are amenable to computation and that complement and give structure to information already existing on the web. The terms and definitions in this appendix aim to provide a context for how SKOS fits into the wider Semantic Web vision.


The eXtensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human- and machine-readable. XML documents use markup tags to describe elements of a given type of content. The language is “extensible” because markup tags are not predefined and must be invented by some human author. The markup describing an element’s content may include attributes that aid in description of that content. For example:

<?xml version="1.0" encoding="utf-8" ?>
<vegetable>
  <name_of_vegetable lang="eng">Carrot</name_of_vegetable>
  <name_of_vegetable lang="lat">Daucus carota</name_of_vegetable>
</vegetable>

Line one of the code includes an XML declaration, which declares information about the type of document that follows. Line two includes a markup tag of an element called vegetable. Line three includes a child element with some content (the word “Carrot”). This content is described as a name of the parent element (vegetable), and it is given an attribute (lang) of English. Line four includes a child element with some content (the term “Daucus carota”). This content is described as another name of the parent element (vegetable), and it is given an attribute (lang) of Latin. Line five includes a closing tag, which signals the end of the element.
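As a rough illustration of the "machine-readable" half of that description, the sketch below parses the carrot document with Python's standard library and reads back each name with its lang attribute. It assumes the two name elements are wrapped in a parent vegetable element, as the walkthrough describes; the function name names_by_language is hypothetical.

```python
import xml.etree.ElementTree as ET

# The carrot example, with the parent <vegetable> element the walkthrough describes.
XML_DOC = """<?xml version="1.0" encoding="utf-8" ?>
<vegetable>
  <name_of_vegetable lang="eng">Carrot</name_of_vegetable>
  <name_of_vegetable lang="lat">Daucus carota</name_of_vegetable>
</vegetable>"""


def names_by_language(xml_text):
    """Return a dict mapping each lang attribute value to that element's text."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        child.get("lang"): child.text
        for child in root.findall("name_of_vegetable")
    }


names = names_by_language(XML_DOC)
print(names)
```

Because the tag names are invented by the author, the parser knows nothing about vegetables; it only exposes the element and attribute structure, and the meaning is left to whoever consumes the data.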


The Resource Description Framework (RDF) is a metadata data model and method for conceptual description of information that provides a common syntax for the web. RDF describes web resources by way of triples, and vocabulary languages such as RDFS and OWL build on that model. A triple is a subject-predicate-object statement about a web resource, which helps to describe that resource. For example, “a carrot is a vegetable” is a triple: a subject (“carrot”), a predicate (“is a”), and an object (“vegetable”). RDF triples are distinctive because subjects and predicates are typically identified by Uniform Resource Identifiers (URIs), while an object may be either a URI or a literal value such as a string. A collection of these RDF statements can form a powerful data model that can be used for many types of information organization and management. Take, for example, the triple <> "Carrot" .

This triple statement means that the resource has a title of “Carrot.” The subject of the triple is a URI for a term in a vocabulary (the veggie vocab). The predicate of this triple is a URI for the Dublin Core metadata element “title.” The object of this triple is the string “Carrot.” This same triple might be expressed in the RDF/XML syntax as:

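<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:title>Carrot</dc:title>
  </rdf:Description>
</rdf:RDF>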
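To make the subject-predicate-object structure concrete, here is a minimal sketch that builds the “title” triple in plain Python and prints it in the line-based N-Triples notation. The subject URI under example.org is a placeholder assumption, since the actual veggie vocab URI is not given here; the predicate is the Dublin Core “title” element URI. The helper to_ntriples is hypothetical and handles only the simple cases shown.

```python
# Placeholder URI for the carrot term (an assumption; the real URI is not given).
CARROT = "http://example.org/veggievocab/carrot"
# URI of the Dublin Core "title" metadata element.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"


def to_ntriples(subject, predicate, obj, literal=True):
    """Serialize one triple as an N-Triples line.

    The object is written as a quoted string when literal is True,
    otherwise as a URI in angle brackets.
    """
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = f'"{escaped}"'
    else:
        obj_term = f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_term} ."


line = to_ntriples(CARROT, DC_TITLE, "Carrot")
print(line)
```

The same three components appear in every RDF serialization; only the surface notation changes.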


The Resource Description Framework Schema (RDFS) is a formally defined knowledge representation language that provides a common data modeling language for data on the web. It can also be thought of as a “semantic extension” to RDF. Information represented in RDFS is described through classes and properties of those classes.
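One way to see what classes add to plain triples is the sketch below, which applies the RDFS idea that a member of a class is also a member of its superclasses. It is a toy illustration in plain Python, not an RDFS reasoner; the veg: terms and both function names are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical schema and data: carrots are root vegetables, which are vegetables.
TRIPLES = {
    ("veg:RootVegetable", SUBCLASS_OF, "veg:Vegetable"),
    ("veg:carrot", RDF_TYPE, "veg:RootVegetable"),
}


def superclasses(cls, triples):
    """Collect every class reachable from cls by following rdfs:subClassOf."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def inferred_types(resource, triples):
    """Return the stated classes of resource plus all of their superclasses."""
    direct = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


types = inferred_types("veg:carrot", TRIPLES)
print(sorted(types))
```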


Like RDFS, the Web Ontology Language (OWL and OWL 2) is a knowledge representation language that provides a common data modeling language for data on the web, but it is more expressive. While fully compatible with RDFS, OWL is also able to augment the meaning of existing RDFS vocabularies. In addition, the original OWL specification defines several variants, or sub-languages: OWL Lite, OWL DL, and OWL Full. OWL Full was designed to be compatible with RDFS, and it allows an ontology to extend the meaning of a given vocabulary. The following example uses OWL 2 functional-style syntax to declare a class named Vegetable in the veggie vocab.

Declaration( Class( :Vegetable ) )


The Terse RDF Triple Language (Turtle) is a textual syntax for RDF that allows RDF statements to be written in a compact and natural text format, with abbreviations for common usage patterns and data types. The following example uses Turtle syntax to state a triple with a SKOS relationship.

<> <> <> .
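The abbreviation mechanism mentioned above can be sketched in a few lines of Python: full URIs are shortened to prefix:name form wherever a declared namespace matches. The SKOS namespace is the published one; the veg: namespace, the carrot and vegetable terms, and the helper names are assumptions made for illustration. The sketch handles only simple local names and URI objects.

```python
PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "veg": "http://example.org/veggievocab/",  # hypothetical namespace
}


def abbreviate(uri, prefixes):
    """Return prefix:local when a namespace matches, else the URI in angle brackets."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return f"<{uri}>"


def to_turtle(triples, prefixes):
    """Write prefix declarations, a blank line, then one abbreviated triple per line."""
    lines = [f"@prefix {p}: <{ns}> ." for p, ns in prefixes.items()]
    lines.append("")
    for s, p, o in triples:
        lines.append(" ".join(abbreviate(t, prefixes) for t in (s, p, o)) + " .")
    return "\n".join(lines)


# "Carrot has the broader concept Vegetable", stated with skos:broader.
triples = [(
    "http://example.org/veggievocab/carrot",
    "http://www.w3.org/2004/02/skos/core#broader",
    "http://example.org/veggievocab/vegetable",
)]
turtle_text = to_turtle(triples, PREFIXES)
print(turtle_text)
```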


The SPARQL Protocol and RDF Query Language (SPARQL) is an RDF query language that provides a standard means for interacting with data on the web. This language allows for the retrieval and manipulation of data stored in RDF format. The following example demonstrates a query in which the user requests some information about a specific term in the veggie vocab. In this example the request is for the color of the term “Carrot.”

PREFIX veggieVocab: <>
SELECT ?vegetable ?color
WHERE {
  ?x veggieVocab:colorname ?color ;
     veggieVocab:isColorOf ?y .
  ?y veggieVocab:vegetablename ?vegetable ;
     veggieVocab:isColorOf veggieVocab:Carrot .
}

Line one of this SPARQL statement declares the prefix for the vocabulary from which to pull information. Line two names the variables whose values will be returned. Line three signals the beginning of the query pattern. Lines four, five, and six specify the way in which the information requested in line two will be matched, and line seven specifies the precise query “What color is a carrot?”
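The matching step that such a query performs can be approximated in plain Python. The sketch below is not a SPARQL engine. It only shows how triple patterns containing variables (terms beginning with “?”) are matched against stored triples and how bindings carry over from one pattern to the next. The data, the veg: prefix, and the hasColor property are hypothetical and simpler than the query above.

```python
def match_pattern(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple; return None if it cannot."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep any value it was bound to earlier.
            if new.get(term, value) != value:
                return None
            new[term] = value
        elif term != value:
            return None
    return new


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match_pattern(pattern, triple, bindings)) is not None
        ]
    return solutions


# Hypothetical veggie vocab data.
DATA = [
    ("veg:Carrot", "veg:vegetablename", "Carrot"),
    ("veg:Carrot", "veg:hasColor", "veg:Orange"),
    ("veg:Orange", "veg:colorname", "orange"),
    ("veg:Spinach", "veg:vegetablename", "Spinach"),
    ("veg:Spinach", "veg:hasColor", "veg:Green"),
    ("veg:Green", "veg:colorname", "green"),
]

# "What color is a carrot?" expressed as three triple patterns.
PATTERNS = [
    ("veg:Carrot", "veg:vegetablename", "?vegetable"),
    ("veg:Carrot", "veg:hasColor", "?x"),
    ("?x", "veg:colorname", "?color"),
]

results = [(b["?vegetable"], b["?color"]) for b in query(PATTERNS, DATA)]
print(results)
```

The shared variable ?x plays the same role as in the query above: it links the pattern that finds the color resource to the pattern that reads that resource's name.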

The application of any of these technologies over large bodies of information requires the construction of detailed maps of particular knowledge domains and the accurate description of information resources on a large scale. Most of this work cannot be done automatically, and this is where SKOS comes into play. Simply put, the SKOS data model is an OWL Full ontology and its data are expressed as RDF triples.