1. ISA Abstract Model
This ISA specification defines an Abstract Model of the metadata framework. The ISA Abstract Model has been implemented in two format specifications, ISA-Tab and ISA-JSON, both of which have supporting tools and services associated with them. The format specifications are also available for additional tooling to take advantage of ISA-formatted content.
The concept map below shows the ISA objects/entities and their relation to one another:
Note
The concept ontology reference depicted above refers to a combination of the Ontology Annotation and
Ontology Source concepts as described below.
1.1. Investigation, Study, Assay
The ISA model consists of three core entities to capture experimental metadata:
- Investigation
- Study
- Assay
An Investigation contains all the information needed to understand the overall goals and means used in an
experiment; experimental steps (or sequences of events) are described in Study and Assay objects. For each
Investigation there may be one or more Study objects associated with it; for each Study there may be one or more
Assay objects.
1.1.1. Investigation
An Investigation is intended to:
- record metadata relating to a given investigation
- link related Study objects under an Investigation (this only becomes necessary when two or more Study objects need to be grouped)
An Investigation is used to record metadata relating to the description of the investigation context, such as the title and
description of the investigation as well as about related people and scholarly publications. Study and Assay objects
are grouped within an Investigation to record other metadata within the relevant contexts.
An Investigation SHOULD record the following:
| Property | Datatype | Description |
|---|---|---|
| Identifier | String | An identifier or an accession number provided by a repository. This SHOULD be locally unique. |
| Title | String | A concise name given to the investigation. |
| Description | String | A textual description of the investigation. |
| Submission Date | Representation of an ISO8601 date | The date on which the investigation was reported to the repository. |
| Public Release Date | Representation of an ISO8601 date | The date on which the investigation was released publicly. |
| Publications | A list of Publication | A list of Publications relating to the investigation. |
| Contacts | A list of Contact | A list of Contacts relating to the investigation. |
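
As an illustration only, the properties in the table above could be held in a small Python container. This is a minimal sketch, not the ISA-Tab or ISA-JSON format and not the API of any ISA tool: the class name, field names, and the sample identifier, title and date are all hypothetical, and plain dictionaries stand in for Publication and Contact objects.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Investigation:
    """Hypothetical container mirroring the Investigation properties."""
    identifier: str
    title: str
    description: str = ""
    # ISO8601 dates are parsed into datetime.date objects.
    submission_date: Optional[date] = None
    public_release_date: Optional[date] = None
    # Plain dicts stand in for Publication and Contact objects.
    publications: List[dict] = field(default_factory=list)
    contacts: List[dict] = field(default_factory=list)


# Placeholder values, chosen only for the example.
inv = Investigation(
    identifier="INV-1",
    title="Example investigation",
    submission_date=date.fromisoformat("2016-10-28"),
)
print(inv.submission_date.isoformat())
```

Using `default_factory=list` gives each instance its own empty list, so publications added to one Investigation do not leak into another.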
1.1.2. Study
A Study is a central concept containing information on the subject under study, its characteristics and any
treatments applied.
A Study contains contextualising information for one or more Assay. Metadata about the study design, study
factors used, and study protocols are recorded in Study objects, as well as information similar to that recorded for an
Investigation, including the title and description of the study, and related people and scholarly publications.
A Study SHOULD record the following:
| Property | Datatype | Description |
|---|---|---|
| Identifier | String | An identifier or an accession number provided by a repository. This SHOULD be locally unique. |
| Title | String | A concise name given to the study. |
| Description | String | A textual description of the study. |
| Submission Date | Representation of an ISO8601 date | The date on which the study was reported to the repository. |
| Public Release Date | Representation of an ISO8601 date | The date on which the study was released publicly. |
| Publications | A list of Publication | A list of Publications relating to the study. |
| Contacts | A list of Contact | A list of Contacts relating to the study. |
| Design Type | Ontology Annotation | A classifier of the study based on the overall experimental design, e.g. cross-over design or parallel group design. |
| Factor Name | String | The name of one factor used in the Study and/or Assay files. A factor corresponds to an independent variable manipulated by the experimentalist with the intention to affect biological systems in a way that can be measured by an assay. The value of a factor is given in the Study or Assay file, accordingly. |
| Factor Type | Ontology Annotation | A classification of this factor into categories. |
In a Study object we record the provenance of biological samples, from source material through a collection process to sample material, represented with directed acyclic graphs (directed graphs with no loops/cycles). The pattern of nodes is usually formed of a source material node, followed by a sample collection process node, followed by a sample material node.
For example:
(source material)->(sample collection)->(sample material)
These study graphs MAY split and pool depending on how the samples are collected.
In a splitting example, multiple samples might be derived from the same source:
(source material 1)->(sample collection)->(sample material 1)
(source material 1)->(sample collection)->(sample material 2)
In a pooling example, multiple sources may be used to create a single sample:
(source material 1)->(sample collection)->(sample material 1)
(source material 2)->(sample collection)->(sample material 1)
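The splitting and pooling patterns above can be sketched as successor maps (adjacency lists). This is an illustrative representation only, assuming that nodes can be identified by their names; the helper names `build_graph` and `predecessors` are hypothetical and not part of the ISA specification.

```python
from collections import defaultdict


def build_graph(paths):
    """Turn a list of node-name paths into a successor map (adjacency list)."""
    successors = defaultdict(set)
    for path in paths:
        # Pair each node with the node that follows it in the path.
        for head, tail in zip(path, path[1:]):
            successors[head].add(tail)
    return successors


def predecessors(graph, node):
    """Return the sorted names of nodes that point directly at `node`."""
    return sorted(n for n, outs in graph.items() if node in outs)


# Splitting: one source yields two samples.
splitting = build_graph([
    ["source material 1", "sample collection", "sample material 1"],
    ["source material 1", "sample collection", "sample material 2"],
])
# Pooling: two sources yield one sample.
pooling = build_graph([
    ["source material 1", "sample collection", "sample material 1"],
    ["source material 2", "sample collection", "sample material 1"],
])

print(sorted(splitting["sample collection"]))
# ['sample material 1', 'sample material 2']
print(predecessors(pooling, "sample collection"))
# ['source material 1', 'source material 2']
```

In this sketch a split shows up as a process node with several successors, and a pool as a process node with several predecessors.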
1.1.3. Assay
An Assay represents a test performed either on material taken from a subject or on a whole initial subject,
producing qualitative or quantitative measurements.
An Assay groups descriptions of provenance of sample processing for related tests. Each test typically
follows the steps of one particular experimental workflow described by a particular protocol.
Assay-related metadata includes descriptions of the measurement type and technology used, and a link to the study
protocol applied. Where an assay produces data files, links to the data are recorded here.
An Assay SHOULD record the following:
| Property | Datatype | Description |
|---|---|---|
| Measurement Type | Ontology Annotation | An Ontology Annotation to qualify the endpoint, or what is being measured (e.g. gene expression profiling or protein identification). |
| Technology Type | Ontology Annotation | An Ontology Annotation to identify the technology used to perform the measurement, e.g. DNA microarray, mass spectrometry. |
| Technology Platform | String | The manufacturer and platform name, e.g. Bruker AVANCE, of the technology used. |
In an Assay we record the provenance of biological samples, from sample material through an experimental workflow, represented with directed acyclic graphs. Assay graphs usually follow the pattern of a sample material, followed by a series of process and material/data nodes.
For example, to show a sample that goes through some extraction process (e.g. nucleic acid extraction) through to producing some sequenced data, we might produce something like:
(sample material)->(extraction process)->(extract)->(sequencing process)->(raw data file)
As with study graphs, splitting and pooling can occur where appropriate in assay graphs.
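One use of such a graph is tracing a data file back to the sample it came from. The sketch below walks the example assay graph upstream; the `derived_from` mapping and the `ancestors` helper are hypothetical names chosen for illustration, and node names are assumed to be unique.

```python
# Each key is a node; each value lists the nodes it was directly derived from.
derived_from = {
    "raw data file": ["sequencing process"],
    "sequencing process": ["extract"],
    "extract": ["extraction process"],
    "extraction process": ["sample material"],
    "sample material": [],
}


def ancestors(node, derived_from):
    """Return every upstream node of `node`, in depth-first order."""
    found = []
    stack = list(derived_from.get(node, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.append(current)
            stack.extend(derived_from.get(current, []))
    return found


print(ancestors("raw data file", derived_from))
# ['sequencing process', 'extract', 'extraction process', 'sample material']
```

Because the graph is acyclic the walk always terminates; the `found` check additionally avoids visiting a node twice when pooled paths converge.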
1.1.4. Study and Assay graphs
Experimental graphs relating to Study and Assay objects are made up of specific types of nodes.
Experimental graphs MUST be directed and acyclic (i.e. MUST NOT contain loops/cycles).
All nodes in Study and Assay graphs MUST be uniquely identifiable. User-defined identifiers MAY also be used.
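The acyclicity requirement can be checked mechanically. The following is a minimal sketch using Kahn's algorithm (repeatedly removing nodes with no incoming edges), assuming the graph is supplied as a list of `(source, destination)` identifier pairs; the function name `is_acyclic` is hypothetical and the specification does not prescribe any particular checking method.

```python
from collections import deque


def is_acyclic(edges):
    """Return True if the directed graph given as (src, dst) pairs has no cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Start from nodes nothing points to, and peel the graph layer by layer.
    ready = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Nodes on a cycle never reach indegree 0, so they are never visited.
    return visited == len(nodes)


valid = [("source", "collection"), ("collection", "sample")]
cyclic = valid + [("sample", "source")]
print(is_acyclic(valid))   # True
print(is_acyclic(cyclic))  # False
```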
Experimental graphs MUST be composed of the following node types:
Material nodes
Material nodes can also be used as a generic structure to describe materials consumed or produced during an experimental workflow.
Material nodes SHOULD record the following:
| Property | Datatype | Description |
|---|---|---|
| Characteristics | A list of Characteristic | A list of material characteristics that may be qualitative or quantitative in description. Qualitative values MAY be Ontology Annotations, while quantitative values MAY be qualified with a Unit definition. |
| Material Type | Ontology Annotation | An Ontology Annotation describing the material. |
Source nodes are a special kind of Material node and are considered as the starting biological material used in a study.
Source nodes SHOULD be followed by a Process node describing a sample collection process, and SHOULD only appear in
Study graphs.
Sample nodes are a special kind of Material node and represent major outputs resulting from a protocol application.
Sample nodes in the Study graphs SHOULD be preceded by a Process node describing a sample collection process. Sample nodes in the Assay graphs SHOULD be followed by a Process node and SHOULD NOT be preceded by any node.
Data nodes
Data nodes represent outputs resulting from a protocol application that corresponds to some process that produces data, typically in the form of data files. Data nodes SHOULD record the following:
| Property | Datatype | Description |
|---|---|---|
| File name | String | A file name or full path referencing a data file produced by the related process that MAY be packaged with, or is accessible via, the ISA reference implementation content. |
Data nodes SHOULD be preceded by a Process node describing a data-producing process, such as NMR scanning or DNA sequencing.
Process nodes
Process nodes represent the application of a protocol to some input material (e.g. a Source) to produce some output (e.g. a Sample).
Process nodes SHOULD record the following:
| Property | Datatype | Description |
|---|---|---|
| Parameter Values | A list of Parameter Value | Reporting on the values taken by parameters when applying a protocol. A protocol description in the Study SHOULD declare the required parameters; the values applied are recorded here. |
| Performer | String | Name of the operator who carried out the protocol. This allows account to be taken of operator effects and can form part of quality control data tracking. |
| Date | Representation of an ISO8601 date | The date on which a protocol is performed. This allows account to be taken of day effects and can form part of quality control data tracking. |
Process nodes SHOULD be preceded by zero or more Material or Data nodes, and followed by zero or more Material or Data nodes.
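Some of the node-ordering recommendations above can be expressed as a simple edge check. The sketch below is one possible reading, not a normative validator: it assumes every node is labelled with a kind string (`"source"`, `"sample"`, `"material"`, `"data"` or `"process"`, all hypothetical labels), and it reports warnings rather than errors because the rules are stated as SHOULD. The function name `check_edges` is likewise hypothetical.

```python
def check_edges(kinds, edges):
    """Return one warning per edge that departs from the SHOULD rules sketched here."""
    warnings = []
    for src, dst in edges:
        if kinds[src] == "source" and kinds[dst] != "process":
            warnings.append(f"{src} -> {dst}: a Source should be followed by a Process")
        if kinds[dst] == "data" and kinds[src] != "process":
            warnings.append(f"{src} -> {dst}: a Data node should be preceded by a Process")
        if kinds[src] == "process" and kinds[dst] == "process":
            warnings.append(f"{src} -> {dst}: a Process should connect to Material or Data nodes")
    return warnings


kinds = {
    "source material": "source",
    "sample collection": "process",
    "sample material": "sample",
}
good = [("source material", "sample collection"), ("sample collection", "sample material")]
bad = [("source material", "sample material")]

print(check_edges(kinds, good))      # []
print(len(check_edges(kinds, bad)))  # 1
```

The check covers only three of the recommendations (Source followed by Process, Data preceded by Process, no Process directly linked to another Process); the Study-versus-Assay distinctions for Sample nodes are left out of this sketch.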
1.2. Ontology Annotation
For a given value, an Ontology Annotation SHOULD qualify this value with an accession number taken from an Ontology
Source.
An Ontology Annotation SHOULD record the following:
| Property | Datatype | Description |
|---|---|---|
| Accession Number | String or URI | The accession number or reference from the Ontology Source associated with the selected term. |
1.3. Ontology Source
An Ontology Source describes the resource from which the value of an Ontology Annotation is derived.
An Ontology Source SHOULD be referenced by an Ontology Annotation. An Ontology Source should contain enough information to
ascertain its provenance.
An Ontology Source SHOULD record the following:
| Property | Datatype | Description |
|---|---|---|
| Name | String | The name of the source of a term; i.e. the source controlled vocabulary or ontology. These names will be used to reference the Ontology Source from an Ontology Annotation. |
| File | String | A file name or a URI of an official resource. |
| Version | String | The version number of the Ontology Source to support term tracking. |
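
The name-based link between an Ontology Annotation and its Ontology Source can be sketched as follows. This is an assumption-laden illustration: the class and field names (including `value` and `source_name`) and the `resolve_source` helper are hypothetical, and the ontology name, URL and accession number are placeholders rather than real terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologySource:
    name: str
    file: str = ""
    version: str = ""


@dataclass(frozen=True)
class OntologyAnnotation:
    value: str                              # the annotated value (assumed field)
    accession_number: Optional[str] = None
    source_name: Optional[str] = None       # matches OntologySource.name


def resolve_source(annotation, sources):
    """Look up the Ontology Source an annotation refers to by name, or None."""
    by_name = {s.name: s for s in sources}
    return by_name.get(annotation.source_name)


# Placeholder source and term, not taken from a real ontology.
sources = [OntologySource(name="EXAMPLE", file="http://example.org/example.owl", version="1")]
term = OntologyAnnotation(value="example term", accession_number="EXAMPLE:0000001",
                          source_name="EXAMPLE")
free_text = OntologyAnnotation(value="unannotated value")

print(resolve_source(term, sources).version)   # 1
print(resolve_source(free_text, sources))      # None
```

Keeping only the source name on the annotation means the file and version are recorded once per Ontology Source rather than repeated on every term.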
1.4. Unit
A Unit is used to classify dimensional data, and is used alongside the relevant values.
A Unit SHOULD be implemented as an Ontology Annotation.
1.5. Publication
A Publication SHOULD record the following:
| Property | Datatype | Description |
|---|---|---|
| PubMed ID | Representation of a PubMed ID | The PubMed IDs of the described publication(s) associated with this investigation. |
| DOI | Representation of a DOI | A Digital Object Identifier (DOI) for that publication (where available). |
| Author List | A list of Strings | The list of authors associated with that publication. |
| Title | String | The title of the publication associated with the investigation. |
| Status | Ontology Annotation | An Ontology Annotation describing the status of that publication (e.g. submitted, in preparation, published). |
1.6. Contact
A Contact SHOULD record the following:
| Property | Datatype | Description |
|---|---|---|
| Name | String | The name of a person. |
| Email | Representation of an email | The email address of a person. |
| Phone | Representation of a phone number | The telephone number of a person. |
| Address | Multi-line string | The address of a person. |
| Affiliation | String | The organization affiliation for a person. |
| Roles | A list of Ontology Annotations | Ontology Annotations to classify the roles performed by this person in the context of an Investigation or Study. |