MIame Notation In Markup Language (MINiML) version 0.3.0

Dennis B. Troup

Note that all elements are defined within the MINiML namespace.

Many of the top level elements have an "iid" (internal identifier) attribute. These identifiers are only intended to be unique within the document. They are not meant to be globally unique identifiers. Accessions should be used to identify elements globally.

This is the top element of a MINiML document. For easier processing, elements should be arranged so that each is defined before it is referenced. This order is enforced for the Database, Organization and Contributor elements, but not for the Platform, Sample and Series elements. This element is extensible with vendor or application specific elements. These may appear after the standard elements. If possible, these elements will be validated as conforming to their schema.

Titles can't be empty, nor too long.

Short descriptions can't be empty, nor too long.

Required tokens can't be empty.

Required strings can't be empty.

A web link.

Delimiters are a single character.

An MD5 128-bit checksum expressed in hexadecimal.

A link to supplementary data.

RefType allows references to other elements.

OrderedRefType allows references to other elements with an optional position specification. This allows the order of the references to be specified independently of the lexical order of the elements. The lexical order, though, should be considered the default order if the position is not specified.

The database element is used to define the databases which give context to the accessions.

The accession type defines an identity in an external database. The ref attribute should contain the value of the iid of a database element.

A record of an entity's status. The status can be specified with respect to a particular database. This element is extensible with vendor or application specific elements. These may appear after the standard elements. If possible, these elements will be validated as conforming to their schema. Additionally, vendor and application specific attributes may be added.

The standard definition of an address.

The standard definition of a person's name.

The Organization element provides a central location for defining the organization to which one or more of the contributors belong. An organization only needs to be defined once in the document.

A Contributor is a person or organization that contributes in some role to the production of a Platform, Sample, or Series. It may be as an author of the paper describing the series, a person that processes the microarray, the organization that sponsored the research, the person who analyzes the data, etc. This element is extensible with vendor or application specific elements. These may appear after the standard elements. If possible, these elements will be validated as conforming to their schema. Additionally, vendor and application specific attributes may be added.

OrderedContributorType allows contributors to be defined in a specific order.

The TableType defines a data table. The Column subelements describe the columns in the table. The number of Column subelements should match the number of columns in the data table. The Columns may have their position within the table explicitly specified. This type is extensible with vendor or application specific elements. These may appear after the standard elements. If possible, these elements will be validated as conforming to their schema. Additionally, vendor and application specific attributes may be added. The data table may either be included inline via the Internal-Data subelement, or be external. How external data tables are associated with a data table definition is beyond the scope of this schema. It is highly suggested that the value of the External-Data element be used as a reference that is dynamically bound, in an implementation-specific manner, by each application. Expecting the value to be a path to an external file, or a specific URI, is not portable and not recommended.

The predefined platform technology types.

The predefined platform distribution types.

A platform describes the platform that forms the basis of a Sample. Some platforms are virtual and do not physically exist, e.g. SAGE platforms. This element is extensible with vendor or application specific elements. These may appear after the standard elements. If possible, these elements will be validated as conforming to their schema. Additionally, vendor and application specific attributes may be added.

Submission Guidelines: Provide a unique title that describes your Platform. We suggest that you use the system [institution/lab][species][number of features][version], e.g., 'FHCRC Mouse 15K v1.0'.

Submission Guidelines: Select the category that best describes the Platform technology.

Submission Guidelines: Microarrays are commercial, non-commercial, or custom-commercial in accordance with how the array was manufactured. Use virtual only if creating a virtual definition for MS, MPSS, SARST, or RT-PCR data.

Submission Guidelines: Identify the organism(s) from which the features on the Platform were designed or derived. Use standard NCBI Taxonomy nomenclature.

Submission Guidelines: Provide the name of the company, facility or laboratory where the Platform was manufactured or produced.

Submission Guidelines: Describe the Platform manufacture protocol. Include as much detail as possible, e.g., clone/primer set identification and preparation, strandedness/length, arrayer hardware/software, spotting protocols. It is strongly recommended that complete protocol descriptions are provided within your submission. References to published protocol descriptions are acceptable - please provide complete citation information. Links to web sites that provide protocol information are not recommended since web addresses and content often change.
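The TableType description above says that the number of Column subelements should match the number of columns in the data table, and that Columns may carry an explicit position. A minimal Python sketch of reading such a table is given here. It rests on assumptions that are not part of the schema text: positions are taken to be 1-based, the delimiter is assumed to be a tab, and the function name `parse_data_table` and the column names in the usage lines are hypothetical.

```python
import csv
import io


def parse_data_table(columns, raw_text, delimiter="\t"):
    """Map each row of a headerless table onto named columns.

    columns: list of (position, name) pairs; positions assumed 1-based.
    raw_text: table body containing data rows only (no header row).
    """
    if len(delimiter) != 1:
        raise ValueError("delimiter must be a single character")
    # Order the column names by their declared position.
    names = [name for _, name in sorted(columns)]
    rows = []
    reader = csv.reader(io.StringIO(raw_text), delimiter=delimiter)
    for line_no, fields in enumerate(reader, start=1):
        if not fields:
            continue  # skip blank lines
        # Enforce "number of Column subelements matches number of columns".
        if len(fields) != len(names):
            raise ValueError(
                f"row {line_no}: expected {len(names)} fields, got {len(fields)}"
            )
        rows.append(dict(zip(names, fields)))
    return rows


# Hypothetical column definitions, deliberately listed out of order.
columns = [(2, "VALUE"), (1, "ID_REF")]
table = parse_data_table(columns, "probe_1\t0.52\nprobe_2\t1.37\n")
print(table[0])
```

Values are kept as strings; converting them to numbers is left to the application, since the text above does not specify column data types.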
Submission Guidelines: Provide the manufacturer catalog number for commercially-available Platforms.

Submission Guidelines: Provide the surface type of the Platform, e.g., glass, nitrocellulose, nylon, silicon, unknown.

Submission Guidelines: Provide the coating of the Platform, e.g., aminosilane, quartz, polysine, unknown.

Submission Guidelines: Provide any additional descriptive information not captured in another field, e.g., array and/or feature physical dimensions, element grid system.

Submission Guidelines: Specify a web link(s) that directs users to supplementary information about the Platform. Please restrict to web sites that you know are stable.

Submission Guidelines: Specify a valid PubMed identifier (PMID) that references a published article that describes the Platform.

Submission Guidelines: Examples of Platform supplementary data include original GAL and CSV files. Supplementary files can be zipped or tarred together with the MINiML file at time of submission.

Submission Guidelines: Data-Tables can be supplied either within the MINiML file (Internal-Data), or can be external files (External-Data). External-Data files should be zipped or tarred together with the MINiML file at the time of submission. A full description of Platform data tables, required columns, content and restrictions is provided in our web submission documentation (http://www.ncbi.nlm.nih.gov/projects/geo/info/depguide.html#DataTableGPL). One difference to note is that data tables do not have headers in MINiML files - table columns are defined by position.

The predefined sample types.

The predefined Sample molecules.

The predefined SRA library strategies.

The predefined SRA library sources.

The predefined SRA library selections.

The predefined instrument models.

SRA Instrument model.

Channels exist within a Sample. This type is extensible with vendor or application specific elements. These may appear after the standard elements. If possible, these elements will be validated as conforming to their schema. Additionally, vendor and application specific attributes may be added.

Submission Guidelines: Briefly identify the biological material and the experimental variable(s) for this Sample, e.g., vastus lateralis muscle, exercised, 60 min.

Submission Guidelines: Identify the organism(s) from which the biological material was derived. Use standard NCBI Taxonomy nomenclature.

Submission Guidelines: List all available characteristics of the biological source, e.g., Strain: C57BL/6; Gender: female; Age: 45 days; Tissue: bladder tumor; Tumor stage: Ta.

Submission Guidelines: Specify the name of the company, laboratory or person that provided the biological material.

Submission Guidelines: Describe any treatments applied to the biological material prior to extract preparation. It is strongly recommended that complete protocol descriptions are provided within your submission. References to published protocol descriptions are acceptable - please provide complete citation information. Links to web sites that provide protocol information are not recommended since web addresses and content often change.

Submission Guidelines: Describe the conditions that were used to grow or maintain organisms or cells prior to extract preparation. It is strongly recommended that complete protocol descriptions are provided within your submission. References to published protocol descriptions are acceptable - please provide complete citation information. Links to web sites that provide protocol information are not recommended since web addresses and content often change.

Submission Guidelines: Specify the type of molecule that was extracted from the biological material.

Submission Guidelines: Describe the protocol used to isolate the extract material. It is strongly recommended that complete protocol descriptions are provided within your submission. References to published protocol descriptions are acceptable - please provide complete citation information. Links to web sites that provide protocol information are not recommended since web addresses and content often change.

Submission Guidelines: Specify the compound used to label the extract, e.g., biotin, Cy3, Cy5, 33P.

Submission Guidelines: Describe the protocol used to label the extract. It is strongly recommended that complete protocol descriptions are provided within your submission. References to published protocol descriptions are acceptable - please provide complete citation information. Links to web sites that provide protocol information are not recommended since web addresses and content often change.

A Sample describes the properties of one sample. A sample has one or more channels and a data table of measured values. This element is extensible with vendor or application specific elements. These may appear after the standard elements. If possible, these elements will be validated as conforming to their schema. Additionally, vendor and application specific attributes may be added.

Submission Guidelines: Provide a unique title that describes this Sample. We suggest that you use the system [biomaterial][condition(s)][replicate number], e.g., Muscle_exercised_60min_rep2.

Submission Guidelines: Supply for SAGE submissions only (this field is derived automatically for other Sample types using the Molecule type).

Submission Guidelines: Supply for SAGE submissions only. State the enzyme anchor, usually NlaIII or Sau3A.

Submission Guidelines: Supply for SAGE submissions only. State the base pair length of the SAGE tags, excluding anchor sequence.

Submission Guidelines: Supply for SAGE submissions only. State the total number of tags quantified in this Sample.

Submission Guidelines: State the number of channels in the experiment, e.g., two-color hybridizations are typically 2-channel, Affymetrix hybridizations are typically 1-channel.
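A Sample has one or more channels, and the guideline above asks submitters to state the number of channels. One way an application might cross-check the two is sketched below in Python. The element names `Channel-Count` and `Channel`, the `position` attribute, the namespace URI `urn:example:miniml`, and the function `check_channel_count` are assumptions made for illustration only; the schema itself is the authority on the actual names.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def check_channel_count(sample):
    """Return (declared, actual) channel counts for one Sample element."""
    declared = None
    actual = 0
    for child in sample:
        name = local_name(child.tag)
        if name == "Channel-Count":  # assumed element name
            declared = int(child.text.strip())
        elif name == "Channel":  # assumed element name
            actual += 1
    return declared, actual


# Hypothetical two-channel Sample in a placeholder namespace.
SAMPLE_XML = """
<Sample xmlns="urn:example:miniml" iid="S1">
  <Channel-Count>2</Channel-Count>
  <Channel position="1"/>
  <Channel position="2"/>
</Sample>
"""

declared, actual = check_channel_count(ET.fromstring(SAMPLE_XML))
print(declared == actual)
```

Matching on the local name keeps the sketch independent of the real namespace URI, which is not given in this text.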
Submission Guidelines: Describe the protocols used for hybridization, blocking and washing, and any post-processing steps such as staining. It is strongly recommended that complete protocol descriptions are provided within your submission. References to published protocol descriptions are acceptable - please provide complete citation information. Links to web sites that provide protocol information are not recommended since web addresses and content often change.

Submission Guidelines: Describe the scanning and image acquisition protocols, hardware, and software. It is strongly recommended that complete protocol descriptions are provided within your submission. References to published protocol descriptions are acceptable - please provide complete citation information. Links to web sites that provide protocol information are not recommended since web addresses and content often change.

Submission Guidelines: Include any additional information not provided in the other fields, or paste in broad descriptions that cannot be easily dissected into the other fields.

Submission Guidelines: Provide details of how data in the VALUE column of your data table were generated and calculated, i.e., normalization method, data selection procedures and parameters, transformation algorithm and scaling parameters.

Submission Guidelines: Reference the Platform iid upon which this Sample is based.

Submission Guidelines: Examples of Sample supplementary data include original GPR, CEL, EXP, RPT, CAB, and TIFF files. Supplementary files can be zipped or tarred together with the MINiML file at time of submission.

Submission Guidelines: SRA raw data files.

Submission Guidelines: Data-Tables can be supplied either within the MINiML file (Internal-Data), or can be external files (External-Data). External-Data files should be zipped or tarred together with the MINiML file at the time of submission. A full description of Sample data tables, required columns, content and restrictions is provided in our web submission documentation:
single channel: http://www.ncbi.nlm.nih.gov/projects/geo/info/depguide.html#scDataTableGSM
dual channel: http://www.ncbi.nlm.nih.gov/projects/geo/info/depguide.html#dcDataTableGSM
SAGE: http://www.ncbi.nlm.nih.gov/projects/geo/info/depguide.html#sageDataTableGSM
One difference to note is that data tables do not have headers in MINiML files - table columns are defined by position.

Series variable types.

A description of one variable of a series and its associated samples.

Series repeats types.

A description of a repeat of a series and its associated samples.

A series is a collection of related samples and a description of how they form a series. This element is extensible with vendor or application specific elements. These may appear after the standard elements. If possible, these elements will be validated as conforming to their schema. Additionally, vendor and application specific attributes may be added.

Submission Guidelines: Provide a unique title that describes the overall study.

Submission Guidelines: Specify valid PubMed identifier(s) (PMID) that reference a published article describing this study. Most commonly, this information is not available at the time of submission - it can be added later once the data are published.

Submission Guidelines: Specify a web link(s) that directs users to supplementary information about the study. Please restrict to web sites that you know are stable.

Submission Guidelines: Summarize the goals and objectives of this study. The abstract from the associated publication may be suitable.

Submission Guidelines: Provide a brief description of the experimental design. Indicate how many Samples are analyzed, if replicates are included, are there control and/or reference Samples, dye-swaps, etc.

Submission Guidelines: Reference all the Sample iid that make up this Series.
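The referencing rules in this document (iid values unique within the document, each element ideally defined before it is referenced, and a Series referencing the iid of each of its Samples) lend themselves to a mechanical check. The Python sketch below is an illustration under assumptions, not part of the schema: it treats every `ref` attribute as a reference to some `iid`, and the element names `Sample-Ref` and `Platform-Ref`, the namespace URI `urn:example:miniml`, and the function `find_reference_problems` are hypothetical.

```python
import xml.etree.ElementTree as ET


def find_reference_problems(root):
    """Walk the document in lexical order and report iid/ref problems."""
    seen = set()
    problems = []
    for elem in root.iter():  # pre-order traversal, i.e. document order
        iid = elem.get("iid")
        if iid is not None:
            if iid in seen:
                problems.append(f"duplicate iid: {iid}")
            seen.add(iid)
        ref = elem.get("ref")
        # A ref that is defined later, or never, is reported the same way.
        if ref is not None and ref not in seen:
            problems.append(f"ref not yet defined: {ref}")
    return problems


# Hypothetical document: the Series refers to Sample S1 before S1 is defined.
DOC = """
<MINiML xmlns="urn:example:miniml">
  <Platform iid="P1"/>
  <Series iid="SER1">
    <Sample-Ref ref="S1"/>
  </Series>
  <Sample iid="S1">
    <Platform-Ref ref="P1"/>
  </Sample>
</MINiML>
"""

problems = find_reference_problems(ET.fromstring(DOC))
print(problems)  # ['ref not yet defined: S1']
```

Because the ordering is only recommended, not enforced, for Platform, Sample and Series elements, an application would probably treat such findings as warnings rather than errors.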
Submission Guidelines: Series supplementary data is usually a TAR file that GEO staff generate from all Platform and Sample supplementary files associated with this Series. It is not necessary for submitters to generate this TAR file.
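The schema describes a checksum type as an MD5 128-bit checksum expressed in hexadecimal, which amounts to 32 hexadecimal digits. As a sketch, such a value could be computed for the bytes of a supplementary file and checked for shape using Python's standard `hashlib` and `re` modules. How a checksum is attached to a supplementary-data link is not described in this text, and the function names `md5_hex` and `looks_like_md5` are hypothetical.

```python
import hashlib
import re

# 128 bits = 32 hexadecimal digits; both letter cases are accepted here.
MD5_HEX = re.compile(r"[0-9a-fA-F]{32}")


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hexadecimal digits."""
    return hashlib.md5(data).hexdigest()


def looks_like_md5(value: str) -> bool:
    """True if value has the shape of a hexadecimal MD5 checksum."""
    return MD5_HEX.fullmatch(value) is not None


checksum = md5_hex(b"abc")
print(checksum, looks_like_md5(checksum))
```

For large supplementary files, the same digest can be built incrementally by calling `update()` on a `hashlib.md5()` object for each chunk read from disk.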