mzIdentML Conformance to MIAPE

This table lists each point in the MIAPE guidelines and states the xPath and/or CV term in mzIdentML that can be used to provide conformance.

The MIAPE document is available here, and general information about MIAPE is here.

MIAPE Section Item xPath (under mzIdentML) Notes a b c d e f g h i j
1 Date stamp (as YYYY-MM-DD) creationDate (attribute) The creation date of the document itself. xsd:dateTime    
AnalysisCollection/SpectrumIdentification/activityDate (attribute) Date spectrum identification performed. xsd:dateTime  
AnalysisCollection/ProteinDetection/activityDate (attribute) Date protein inferencing performed. xsd:dateTime na na
Responsible person (or institutional role if more appropriate); provide name, affiliation and stable contact information Provider/ContactRole An institutional email address can generally satisfy this requirement.    
Software name, version and manufacturer AnalysisSoftwareList/AnalysisSoftware/name  
AnalysisSoftwareList/AnalysisSoftware/version    
AnalysisSoftwareList/AnalysisSoftware/ContactRole  
Customisations made to that software AnalysisSoftwareList/AnalysisSoftware/Customizations No customisations in some examples for illustration.
In the other cases this is just not applicable (na).
na na na na na na
Availability of that software AnalysisSoftwareList/AnalysisSoftware/URI A reference to the vendor, or the public URL if a publicly available version has been used.    
Location of the files generated; parameter files, spectral data (input/output) DataCollection/Inputs/SourceFile The location of the data generated. If made available in a public repository, describe the URI (for instance a URL, or the URL of the repository and the information on how to retrieve the data). If not made available for public access, describe the contact person reference or source and the internal coordinates of the data, e.g. Sequest .out, Mascot .dat. [Note to MIAPE authors: This is confusing because of the overlap with the next section, so we consider only Inputs/SourceFile here and not the .dta files etc.]  
2 Input data – Description and type of MS data DataCollection/Inputs/SpectraData Provide a short description that can refer to the data in the experiment (e.g. LC-MS run1). [Refer to mzML source file for information - outside scope of mzIdentML]                    
DataCollection/Inputs/SpectraData/fileFormat        
Input data – Availability of MS data (source of data) DataCollection/Inputs/SpectraData Location (URI) of input data file  
Input parameters - Databases queried; description and versions (including number of entries searched) DataCollection/Inputs/SearchDatabase/DatabaseName
and/or
DataCollection/Inputs/SearchDatabase/location
   
DataCollection/Inputs/SearchDatabase/version      
DataCollection/Inputs/SearchDatabase/numDatabaseSequences   na
Input parameters - Taxonomical restrictions applied AnalysisProtocolCollection/SpectrumIdentificationProtocol/DatabaseFilters Specify the ... subset of the databank(s) (for instance, “mammals”, an NCBI TaxId, a list of accession numbers). na na na na na na na na
DataCollection/AnalysisData/SpectrumIdentificationList/numSequencesSearched Specify the number of entries searched. na na na na na na na na
Input parameters - Description of tool and scoring scheme AnalysisProtocolCollection/SpectrumIdentificationProtocol/AdditionalSearchParams/cvParam Descriptor of the scoring algorithm in the search engine (such as ESI-TRAP in Mascot, ESI... [Note to MIAPE authors: These example parameters are a little search-engine specific]
Input parameters - Specified cleavage agent(s) AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes Describe the cleavage agent as available on the search engine. If the cleavage agent rules have been defined by the user, describe the cleavage rules.
Input parameters - Allowed number of missed cleavages AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes/Enzyme/missedCleavages Maximum allowed number of cleavage sites missed by the specified agent during the in-silico cleavage process. For a no-enzyme search, use the "No Enzyme" CV term and omit the number of missed cleavages.
Input parameters - Additional parameters related to cleavage AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes The Enzymes section is flexible. Example 'a' shows a case of a mixed enzyme. na na na na na na na na na
Input parameters - Permissible amino acids modifications AnalysisProtocolCollection/SpectrumIdentificationProtocol/ModificationParams/SearchModification Using the PSI-MS names available from Unimod na na na na
Input parameters - Precursor-ion and fragment ion mass tolerance for tandem MS (when applicable) AnalysisProtocolCollection/SpectrumIdentificationProtocol/FragmentTolerance
AnalysisProtocolCollection/SpectrumIdentificationProtocol/ParentTolerance
  na na
Input parameters - Mass tolerance for PMF (when applicable) AnalysisProtocolCollection/SpectrumIdentificationProtocol/ParentTolerance   na na na na na na na na na
Input parameters - Thresholds; minimum scores for peptides, proteins (probabilities, number of hits, other metrics) AnalysisProtocolCollection/SpectrumIdentificationProtocol/AdditionalSearchParams/cvParam                    
AnalysisProtocolCollection/ProteinDetectionProtocol/AnalysisParams/cvParam   na     na
Input parameters - Any other relevant parameters AnalysisProtocolCollection/SpectrumIdentificationProtocol/AdditionalSearchParams/cvParam  
3 Identified proteins - Accession code in the queried database SequenceCollection/DBSequence/accession  
Identified proteins - Protein description SequenceCollection/DBSequence/cvParam accession="MS:1001088"   na     na
Identified proteins - Protein scores DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam   na na na
Identified proteins - Validation status DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam accession="MS:1001060" For all protein hits in the search, specify if accepted without post-processing of search engine/de-novo interpretation (accept raw output of identification software) or if manually accepted as valid or as rejected (false positive).           na       na
Identified proteins - Number of different peptide sequences (without considering modifications) assigned to the protein DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam
accession="MS:1001097"
  na     na
Identified proteins - Percent peptide coverage of protein DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam
accession="MS:1001093"
  na     na
Identified proteins - Identity of supporting peptides DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/PeptideHypothesis   na na
Identified proteins - In the case of PMF, number of matched/unmatched peaks DataCollection/AnalysisData/ProteinDetectionList/ProteinAmbiguityGroup/ProteinDetectionHypothesis/cvParam
accession="MS:1001097" name="distinct peptide sequences"
accession="MS:1001362" name="number of unmatched peaks"
  na na na na na na na na na
For identified peptides - Sequence (indicate any deviation from the expected protein cleavage specificity) SequenceCollection/Peptide/peptideSequence  
For identified peptides - Peptide scores DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/cvParam   na
For identified peptides - Chemical modifications (artefactual) and post-translational modifications (naturally occurring); sequence polymorphisms with experimental evidence (particularly for isobaric modifications) SequenceCollection/Peptide/Modification   na na na
For identified peptides - Corresponding spectrum locus DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/start and end      
For identified peptides - Charge assumed for identification and a measurement of peptide mass error DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/chargeState    
DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/calculatedMassToCharge - DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/experimentalMassToCharge    
For identified peptides - Other additional information, when used for evaluation of confidence DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/cvParam  
Quantitation for selected ions - Quantitation approach (e.g. 4plex-iTRAQ, ICAT, cICAT, COFRADIC) Out of scope Planned for mzQuantML                    
Quantitation for selected ions - Quantity measurement (e.g. integration of signals, use of signal intensity) Out of scope Planned for mzQuantML                    
Quantitation for selected ions - Data transformation and normalisation technique (description of method and software) Out of scope Planned for mzQuantML                    
Quantitation for selected ions - Number of replicates (biological and technical) Out of scope Planned for mzQuantML                    
Quantitation for selected ions - Acceptance criteria (including measure of errors) Out of scope Planned for mzQuantML                    
Quantitation for selected ions - Estimates of uncertainty and the methods for the error analysis, including the treatment of relevant systematic error effects and the treatment of random error issues. Results from controls (when described) Out of scope Planned for mzQuantML                    
4 Assessment and confidence given to the identification and quantitation (description of methods, thresholds, values, etc.) AnalysisProtocolCollection/SpectrumIdentificationProtocol/Threshold
and
ProteinDetectionProtocol/Threshold
For example, MS:1001316, mascot:SigThreshold    
4 Results of statistical analysis or determination of false positive rate in case of large scale experiments                    
4 Inclusion/exclusion of the output of the software are provided (description of what part of the output has been kept, what part has been rejected) DataCollection/AnalysisData/SpectrumIdentificationList/SpectrumIdentificationResult/SpectrumIdentificationItem/@passThreshold
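
As a rough illustration of how the xPath column above could be used, the following sketch checks a handful of the listed paths against a toy document using Python's standard xml.etree.ElementTree module. Several things here are assumptions made for the example and are not taken from the table: the toy document is not a schema-valid mzIdentML file, name and version are treated as attributes of AnalysisSoftware, the names SAMPLE, CHECKS and check_conformance are hypothetical, and the sample omits the XML namespace that real mzIdentML files normally declare (with a namespace, each path step would need the namespace prefix).

```python
import xml.etree.ElementTree as ET

# Toy document. Element and attribute names follow the table above;
# this is NOT a schema-valid mzIdentML file and it omits the XML namespace.
SAMPLE = """<mzIdentML creationDate="2009-01-01T00:00:00">
  <AnalysisSoftwareList>
    <AnalysisSoftware name="ExampleEngine" version="1.0"/>
  </AnalysisSoftwareList>
  <DataCollection>
    <Inputs>
      <SpectraData location="file:///data/run1.mzML"/>
    </Inputs>
  </DataCollection>
</mzIdentML>"""

# (MIAPE item, path under the mzIdentML root, attribute name or None).
# Only a small, hand-picked subset of the table is covered here.
CHECKS = [
    ("1 Date stamp", ".", "creationDate"),
    ("1 Software name", "AnalysisSoftwareList/AnalysisSoftware", "name"),
    ("1 Software version", "AnalysisSoftwareList/AnalysisSoftware", "version"),
    ("2 Input data", "DataCollection/Inputs/SpectraData", None),
    ("2 Cleavage agent",
     "AnalysisProtocolCollection/SpectrumIdentificationProtocol/Enzymes", None),
]


def check_conformance(xml_text, checks=CHECKS):
    """Return {MIAPE item: True/False} for each path in `checks`."""
    root = ET.fromstring(xml_text)
    report = {}
    for item, path, attr in checks:
        # "." refers to the root element itself.
        elements = [root] if path == "." else root.findall(path)
        if attr is None:
            # The item is satisfied if the element is present at all.
            report[item] = len(elements) > 0
        else:
            # The item is satisfied if any matching element carries
            # a non-empty value for the attribute.
            report[item] = any(e.get(attr) for e in elements)
    return report


if __name__ == "__main__":
    for item, ok in check_conformance(SAMPLE).items():
        print(f"{item}: {'present' if ok else 'missing'}")
```

In this toy document the cleavage agent entry is reported as missing because no AnalysisProtocolCollection is present. A presence check like this only shows that a path is populated; it does not judge whether the content actually satisfies the MIAPE requirement.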