1. ISA Abstract Model
This ISA specification defines an Abstract Model of the metadata framework. The ISA Abstract Model has been implemented in two format specifications, ISA-Tab and ISA-JSON, both of which have supporting tools and services associated with them. The format specifications are also available for additional tooling to take advantage of ISA-formatted content.
The concept map below shows the ISA objects/entities and their relation to one another:
Note
The concept "ontology reference" depicted above refers to a combination of the Ontology Annotation and Ontology Source concepts as described below.
1.1. Investigation, Study, Assay
The ISA model consists of three core entities to capture experimental metadata:
- Investigation
- Study
- Assay
An Investigation contains all the information needed to understand the overall goals and means used in an experiment; experimental steps (or sequences of events) are described in Study and Assay objects. For each Investigation there may be one or more Study objects associated with it; for each Study there may be one or more Assay objects.
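The containment described above can be sketched with plain Python dataclasses. This is only an illustration under assumptions: the class layout, the field names (`identifier`, `studies`, `assays`, `measurement_type`) and the sample values are hypothetical and are not taken from the ISA-Tab or ISA-JSON format specifications.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    # Hypothetical minimal field; the properties an Assay SHOULD record
    # are listed in section 1.1.3.
    measurement_type: str


@dataclass
class Study:
    identifier: str
    # A Study groups one or more Assay objects.
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    identifier: str
    # An Investigation groups one or more Study objects.
    studies: List[Study] = field(default_factory=list)


# Placeholder identifiers, chosen only for this example.
investigation = Investigation(
    identifier="INV-1",
    studies=[
        Study(
            identifier="STU-1",
            assays=[
                Assay(measurement_type="gene expression profiling"),
                Assay(measurement_type="protein identification"),
            ],
        )
    ],
)

# Count every Assay reachable from the Investigation.
assay_count = sum(len(study.assays) for study in investigation.studies)
print(assay_count)
```

Nesting lists in this way mirrors the one-to-many relations in the text; a real implementation would follow whichever format specification it targets.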
1.1.1. Investigation
An Investigation is intended to:
- record metadata relating to a given investigation
- link related Study objects under an Investigation (this only becomes necessary when two or more Study objects need to be grouped)
An Investigation is used to record metadata describing the investigation context, such as the title and description of the investigation, as well as related people and scholarly publications. Study and Assay objects are grouped within an Investigation to record other metadata within the relevant contexts.
An Investigation SHOULD record the following:
Property | Datatype | Description |
---|---|---|
Identifier | String | An identifier or an accession number provided by a repository. This SHOULD be locally unique. |
Title | String | A concise name given to the investigation. |
Description | String | A textual description of the investigation. |
Submission Date | Representation of an ISO8601 date | The date on which the investigation was reported to the repository. |
Public Release Date | Representation of an ISO8601 date | The date on which the investigation was released publicly. |
Publications | A list of Publication | A list of Publications relating to the investigation. |
Contacts | A list of Contact | A list of Contacts relating to the investigation. |
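As a rough sketch of how the scalar properties in this table might be held in memory, the example below stores the two dates as `datetime.date` values parsed from ISO8601 calendar dates. The class name `InvestigationRecord`, the helper `parse_iso8601_date` and all values are assumptions made for illustration; Publications and Contacts are left out to keep the sketch short.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class InvestigationRecord:
    # Field names are hypothetical; they follow the property names in the table.
    identifier: str
    title: str
    description: str
    submission_date: date
    public_release_date: date


def parse_iso8601_date(text: str) -> date:
    # date.fromisoformat accepts the extended calendar form YYYY-MM-DD
    # and raises ValueError for strings such as "01/04/2016".
    return date.fromisoformat(text)


# Placeholder values, chosen only for this example.
record = InvestigationRecord(
    identifier="INV-1",
    title="Example investigation",
    description="Placeholder description.",
    submission_date=parse_iso8601_date("2016-04-01"),
    public_release_date=parse_iso8601_date("2016-10-01"),
)

# Parsed dates can be compared or subtracted directly.
days_until_release = (record.public_release_date - record.submission_date).days
```

Parsing at the boundary means a malformed date is rejected when the record is built rather than when it is later used.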
1.1.2. Study
A Study is a central concept containing information on the subject under study, its characteristics and any treatments applied.
A Study contains contextualising information for one or more Assay objects. Metadata about the study design, study factors used, and study protocols are recorded in Study objects, together with information similar to that of the Investigation, including the title and description of the study, and related people and scholarly publications.
A Study SHOULD record the following:
Property | Datatype | Description |
---|---|---|
Identifier | String | An identifier or an accession number provided by a repository. This SHOULD be locally unique. |
Title | String | A concise name given to the study. |
Description | String | A textual description of the study. |
Submission Date | Representation of an ISO8601 date | The date on which the study was reported to the repository. |
Public Release Date | Representation of an ISO8601 date | The date on which the study was released publicly. |
Publications | A list of Publication | A list of Publications relating to the study. |
Contacts | A list of Contact | A list of Contacts relating to the study. |
Design Type | Ontology Annotation | A classifier of the study based on the overall experimental design, e.g. cross-over design or parallel group design. |
Factor Name | String | The name of one factor used in the Study and/or Assay files. A factor corresponds to an independent variable manipulated by the experimentalist with the intention to affect biological systems in a way that can be measured by an assay. The value of a factor is given in the Study or Assay file, accordingly. |
Factor Type | Ontology Annotation | A classification of this factor into categories. |
In a Study object we record the provenance of biological samples, from source material through a collection process to sample material, represented with directed acyclic graphs (directed graphs with no loops/cycles). The pattern of nodes is usually formed of a source material node, followed by a sample collection process node, followed by a sample material node.
For example:
(source material)->(sample collection)->(sample material)
These study graphs MAY split and pool depending on how the samples are collected.
In a splitting example, multiple samples might be derived from the same source:
(source material 1)->(sample collection)->(sample material 1)
(source material 1)->(sample collection)->(sample material 2)
In a pooling example, multiple sources may be used to create a single sample:
(source material 1)->(sample collection)->(sample material 1)
(source material 2)->(sample collection)->(sample material 1)
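The splitting and pooling patterns above can be detected mechanically. The sketch below is one possible approach, assuming each line of the examples is stored as a hypothetical `(source, process, sample)` triple; the function name `find_splits_and_pools` is invented for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# One (source material, process, sample material) path per triple.
Path = Tuple[str, str, str]


def find_splits_and_pools(paths: List[Path]):
    samples_by_source: Dict[str, Set[str]] = defaultdict(set)
    sources_by_sample: Dict[str, Set[str]] = defaultdict(set)
    for source, _process, sample in paths:
        samples_by_source[source].add(sample)
        sources_by_sample[sample].add(source)
    # Splitting: one source yields more than one sample.
    splits = {s: sorted(v) for s, v in samples_by_source.items() if len(v) > 1}
    # Pooling: one sample is derived from more than one source.
    pools = {s: sorted(v) for s, v in sources_by_sample.items() if len(v) > 1}
    return splits, pools


splitting = [
    ("source material 1", "sample collection", "sample material 1"),
    ("source material 1", "sample collection", "sample material 2"),
]
pooling = [
    ("source material 1", "sample collection", "sample material 1"),
    ("source material 2", "sample collection", "sample material 1"),
]

split_result = find_splits_and_pools(splitting)
pool_result = find_splits_and_pools(pooling)
```

For the splitting input, the first returned mapping lists both samples under `source material 1`; for the pooling input, the second mapping lists both sources under `sample material 1`.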
1.1.3. Assay
An Assay represents a test performed either on material taken from a subject or on a whole initial subject, producing qualitative or quantitative measurements.
An Assay groups descriptions of provenance of sample processing for related tests. Each test typically follows the steps of one particular experimental workflow described by a particular protocol.
Assay-related metadata includes descriptions of the measurement type and technology used, and a link to the study protocol applied. Where an assay produces data files, links to the data are recorded here.
An Assay SHOULD record the following:
Property | Datatype | Description |
---|---|---|
Measurement Type | Ontology Annotation | An Ontology Annotation to qualify the endpoint, or what is being measured (e.g. gene expression profiling or protein identification). |
Technology Type | Ontology Annotation | An Ontology Annotation to identify the technology used to perform the measurement, e.g. DNA microarray, mass spectrometry. |
Technology Platform | String | The manufacturer and platform name, e.g. Bruker AVANCE, of the technology used. |
In an Assay we record the provenance of biological samples, from sample material through an experimental workflow, represented with directed acyclic graphs. Assay graphs usually follow the pattern of a sample material, followed by a series of process and material/data nodes.
For example, to show a sample that goes through some extraction process (e.g. nucleic acid extraction) through to producing some sequenced data, we might produce something like:
(sample material)->(extraction process)->(extract)->(sequencing process)->(raw data file)
As with study graphs, splitting and pooling can occur where appropriate in assay graphs.
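The arrow notation used in these examples is informal, but it is easy to turn into a list of directed edges. The helper below, `parse_path`, is a hypothetical convenience written for this illustration and is not part of any ISA format.

```python
from typing import List, Tuple


def parse_path(text: str) -> List[Tuple[str, str]]:
    # Split on the arrows, then drop the surrounding parentheses.
    nodes = [part.strip().strip("()") for part in text.split("->")]
    # Pair each node with its successor to obtain directed edges.
    return list(zip(nodes, nodes[1:]))


assay_edges = parse_path(
    "(sample material)->(extraction process)->(extract)"
    "->(sequencing process)->(raw data file)"
)
for start, end in assay_edges:
    print(start, "->", end)
```

The five nodes of the sequencing example give four edges, alternating between material or data nodes and process nodes.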
1.1.4. Study and Assay graphs
Experimental graphs relating to Study and Assay objects are made up of specific types of nodes.
Experimental graphs MUST be directed and acyclic (i.e. MUST NOT contain loops/cycles).
All nodes in Study and Assay graphs MUST be uniquely identifiable. User-defined identifiers MAY also be used.
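Both MUST constraints can be checked automatically. The following is a minimal sketch, assuming a graph is given as a list of `(start, end)` identifier pairs; it uses Kahn's topological-sort algorithm for the cycle check, and the function names `is_acyclic` and `duplicate_ids` are hypothetical.

```python
from collections import Counter, deque
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]


def is_acyclic(edges: Iterable[Edge]) -> bool:
    """Kahn's algorithm: the graph is acyclic if every node can be removed
    in topological order."""
    successors: Dict[str, List[str]] = {}
    in_degree: Dict[str, int] = {}
    for start, end in edges:
        successors.setdefault(start, []).append(end)
        successors.setdefault(end, [])
        in_degree[end] = in_degree.get(end, 0) + 1
        in_degree.setdefault(start, 0)
    # Start from the nodes that have no incoming edges.
    ready = deque(node for node, degree in in_degree.items() if degree == 0)
    removed = 0
    while ready:
        node = ready.popleft()
        removed += 1
        for following in successors[node]:
            in_degree[following] -= 1
            if in_degree[following] == 0:
                ready.append(following)
    # Any node left over sits on a cycle or downstream of one.
    return removed == len(in_degree)


def duplicate_ids(node_ids: Iterable[str]) -> List[str]:
    # Identifiers declared more than once violate the uniqueness rule.
    return sorted(i for i, count in Counter(node_ids).items() if count > 1)


valid_edges = [("source 1", "collection 1"), ("collection 1", "sample 1")]
cyclic_edges = valid_edges + [("sample 1", "source 1")]

valid_ok = is_acyclic(valid_edges)
cyclic_ok = is_acyclic(cyclic_edges)
repeated = duplicate_ids(["source 1", "sample 1", "sample 1"])
```

Kahn's algorithm runs in time linear in the number of nodes and edges, which is why it is a common choice for this kind of validation.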
Experimental graphs MUST be composed of the following node types:
Material nodes
Material nodes can also be used as a generic structure to describe materials consumed or produced during an experimental workflow. Material nodes SHOULD record the following:
Property | Datatype | Description |
---|---|---|
Characteristics | A list of Characteristic | A list of material characteristics that may be qualitative or quantitative in description. Qualitative values MAY be Ontology Annotations, while quantitative values MAY be qualified with a Unit definition. |
Material Type | Ontology Annotation | An Ontology Annotation describing the material. |
Source nodes are a special kind of Material node and are considered as the starting biological material used in a study. Source nodes SHOULD be followed by a Process node describing a sample collection process, and SHOULD only appear in Study graphs.
Sample nodes are a special kind of Material node and represent major outputs resulting from a protocol application. Sample nodes in the Study graphs SHOULD be preceded by a Process node describing a sample collection process. Sample nodes in the Assay graphs SHOULD be followed by a Process node and SHOULD NOT be preceded by any node.
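The placement rules for Source and Sample nodes in a Study graph could be checked along the following lines. This is a sketch under assumptions: node kinds are stored as plain strings in a hypothetical `kinds` mapping, and because the rules are SHOULD-level the function returns warnings rather than raising errors.

```python
from typing import Dict, List, Tuple


def check_study_graph(
    kinds: Dict[str, str], edges: List[Tuple[str, str]]
) -> List[str]:
    warnings = []
    for node, kind in kinds.items():
        following = [kinds[end] for start, end in edges if start == node]
        preceding = [kinds[start] for start, end in edges if end == node]
        # A Source should lead into a Process node and nothing else.
        if kind == "source" and (
            not following or any(k != "process" for k in following)
        ):
            warnings.append(f"{node}: Source should be followed by a Process node")
        # In a Study graph, a Sample should come out of a Process node.
        if kind == "sample" and (
            not preceding or any(k != "process" for k in preceding)
        ):
            warnings.append(f"{node}: Sample should be preceded by a Process node")
    return warnings


# Placeholder node identifiers, chosen only for this example.
kinds = {"src1": "source", "proc1": "process", "smp1": "sample"}
good = check_study_graph(kinds, [("src1", "proc1"), ("proc1", "smp1")])
bad = check_study_graph(kinds, [("src1", "smp1")])
```

The well-formed graph yields no warnings, while linking the Source directly to the Sample yields one warning for each of the two nodes.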
Data nodes
Data nodes represent outputs resulting from a protocol application that corresponds to some process that produces data, typically in the form of data files. Data nodes SHOULD record the following:
Property | Datatype | Description |
---|---|---|
File name | String | A file name or full path referencing a data file produced by the related process that MAY be packaged with, or is accessible via, the ISA reference implementation content. |
Data nodes SHOULD be preceded by a Process node describing a data-producing process, such as NMR scanning or DNA sequencing.
Process nodes
Process nodes represent the application of a protocol to some input material (e.g. a Source) to produce some output (e.g. a Sample).
Process nodes SHOULD record the following:
Property | Datatype | Description |
---|---|---|
Parameter Values | A list of Parameter Value | Reporting on the values taken by parameters when applying a protocol. A protocol description in the Study SHOULD declare the required parameters; the values applied are recorded here. |
Performer | String | Name of the operator who carried out the protocol. This allows account to be taken of operator effects and can be part of quality-control data tracking. |
Date | Representation of an ISO8601 date | The date on which a protocol is performed. This allows account to be taken of day effects and can be part of quality-control data tracking. |
Process nodes SHOULD be preceded by zero or more Material or Data nodes, and followed by zero or more Material or Data nodes.
1.2. Ontology Annotation
For a given value, an Ontology Annotation SHOULD qualify this value with an accession number taken from an Ontology Source.
An Ontology Annotation SHOULD record the following:
Property | Datatype | Description |
---|---|---|
Accession Number | String or URI | The accession number or reference from the Ontology Source associated with the selected term. |
1.3. Ontology Source
An Ontology Source describes the resource from which the value of an Ontology Annotation is derived.
An Ontology Source SHOULD be referenced by an Ontology Annotation. An Ontology Source should contain enough information to ascertain its provenance.
An Ontology Source SHOULD record the following:
Property | Datatype | Description |
---|---|---|
Name | String | The name of the source of a term; i.e. the source controlled vocabulary or ontology. These names will be used to reference the Ontology Source from an Ontology Annotation. |
File | String | A file name or a URI of an official resource. |
Version | String | The version number of the Ontology Source, to support term tracking. |
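The link between an Ontology Annotation and its Ontology Source, made through the source name, might be modelled as below. The `value` and `source_name` fields, the `resolve_source` helper and every value shown are assumptions for illustration only; the `example.org` URI and the `EXAMPLE` prefix are placeholders, not real ontology resources.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class OntologySource:
    name: str
    file: str
    version: str


@dataclass(frozen=True)
class OntologyAnnotation:
    value: str             # the annotated value (assumed field)
    accession_number: str  # accession taken from the Ontology Source
    source_name: str       # name of the referenced Ontology Source (assumed field)


def resolve_source(
    annotation: OntologyAnnotation, sources: Dict[str, OntologySource]
) -> OntologySource:
    # Raises KeyError when the annotation names an undeclared source.
    return sources[annotation.source_name]


# Placeholder source and term, invented for this example.
example_source = OntologySource(
    name="EXAMPLE",
    file="http://example.org/example-ontology.owl",
    version="1",
)
sources = {example_source.name: example_source}

annotation = OntologyAnnotation(
    value="example term",
    accession_number="EXAMPLE:0000001",
    source_name="EXAMPLE",
)
resolved = resolve_source(annotation, sources)
```

Keeping sources in a dictionary keyed by name makes a dangling reference fail at lookup time, which helps when checking that each annotation's provenance can be ascertained.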
1.4. Unit
A Unit is used to classify dimensional data, and is used together with the relevant values.
A Unit SHOULD be implemented as an Ontology Annotation.
1.5. Publication
A Publication SHOULD record the following:
Property | Datatype | Description |
---|---|---|
PubMed ID | Representation of a PubMed ID | The PubMed IDs of the described publication(s) associated with this investigation. |
DOI | Representation of a DOI | A Digital Object Identifier (DOI) for that publication (where available). |
Author List | A list of Strings | The list of authors associated with that publication. |
Title | String | The title of the publication associated with the investigation. |
Status | Ontology Annotation | An Ontology Annotation describing the status of that publication (e.g. submitted, in preparation, published). |
1.6. Contact
A Contact SHOULD record the following:
Property | Datatype | Description |
---|---|---|
Name | String | The name of a person. |
Email | Representation of an email | The email address of a person. |
Phone | Representation of a phone number | The telephone number of a person. |
Address | Multi-line string | The address of a person. |
Affiliation | String | The organizational affiliation of a person. |
Roles | A list of Ontology Annotations | Ontology Annotations to classify the roles performed by this person in the context of an Investigation or Study. |