mzIdentML Development Timeline

1. Spring 2006, Meeting in San Francisco, USA – start of a UML model for AnalysisXML (universal standard for all types of proteome informatics)

Orchard et al. Proteomics 2006, 6, 4439–4443:

“PSI-Proteomics Informatics (PSI-PI) working group now has responsibility for the production of the mass spectrometry informatics standards, such as analysisXML, which will cover, among other things, protein identification reporting. The remit of the groups is to produce a UML data model with an XML implementation, example instance documentation, a validation tool, and an accompanying ontology. The use cases were reviewed and expanded upon and the existing version analysisXML reviewed in the light of these use cases. Migration to a UML model should be achieved in time for ASMS in order to generate an XML schema for public viewing.”

2. Fall 2006, Meeting in Washington DC, USA – continued work on the AnalysisXML schema

Orchard et al. Proteomics 2007, 7, 337–339:

“Work also continued on AnalysisXML, with the revision of a file containing a list of information available in output files from a majority of currently available search engines. A number of common elements have been mapped to the current model and have been associated to appropriate CV terms.”

3. Spring 2007, Meeting in Lyon, France – continued work on the AnalysisXML schema

Orchard et al. Proteomics 2007, 7, 3436–3440:

“A draft analysisXML schema and example instance documents were produced at the PSI Autumn 2005 workshop in Washington [5]. In the last few months, feedback has been received from all the major search engine vendors on the parameter spreadsheet and a draft CV prepared as an .OBO file. The aims of the meeting were to further develop the schema, review the instance document and improve the general documentation. By the end of the workshop, the schema had been tested against all of the MIAPE-MSI requirements, with the exception of the requirements for quantification for which a structure has been discussed. SILAC and iTRAQ features have been added as a feature set and these sets can be combined to give a ratio. Instance documents were reviewed and modified with new use cases such as top-down, mixed MS and MS/MS, de Novo sequencing and error tolerant tag searches discussed. Protein inference analysis has been more clearly split from peptide identification. Finally, a decision was made to put the terms required by two or more search engines directly into the schema as attributes/elements rather than described in a CV.”

4. Spring 2008, Meeting in Toledo, Spain – agreed to switch to direct XSD development to speed completion

Orchard et al. Proteomics 2008, 8, 4168–4172:

“The development of analysisXML has proven far from straightforward, partly because the scope of the project has changed often in a fast moving field. The main aim of this meeting was to readdress the goals of the project and produce a timeline for completing the first release. Fundamental questions such as whether it is practical to try to write a schema that can cover all scenarios, including quantitation support, in the first implementation were considered.

analysisXML has been developed as an extension to FuGE by creating a schema from UML. It was agreed to continue by developing the XML schema directly and extending a cut-down version of the FuGE xsd. Rather than use the FuGE format for the controlled vocabulary, it was agreed to use the same format as for mzML version 1.0. It was also agreed that quantitation will not be addressed until version 2.0. However, a scheme was developed that will ensure that version 1.0 documents will be backwards compatible with the 2.0 schema. Development of quantitation support will be carried out in parallel to the version 1.0 release.”


5. December 2008, submission of AnalysisXML to the PSI document process


6. Spring 2009, Meeting in Turku, Finland – AnalysisXML split into an identification format (mzIdentML) and a quantitation format (mzQuantML); minor changes made to the schema

Orchard et al. Proteomics 2009, 9, 4429–4432:

“The scope of the current format is limited to protein identification and the format previously known as AnalysisXML has been renamed mzIdentML to reflect this. The resources now include semantic validation tools, specification document, tables of conformance to both the MIAPE and MCP guidelines and 12 example instance documents. A manuscript has been prepared and the format was submitted to the PSI document review process in December 2008. The feedback from this process has resulted in a minor set of changes to the schema, documentation and examples.”


“It is now planned to develop a separate schema, mzQuantML, with a structure broadly similar to mzIdentML to add the ability to handle quantitation data.”


7. August 2009 – completion of the PSI document process and formal release of version 1 of mzIdentML


8. 2010–2011 – some minor issues were identified with verbosity in the files and with redundant information being captured, along with a few minor bugs. A decision was taken to fix all of the bugs in one go and release a new version 1.1


9. August 2011 – version 1.1 released from the PSI document process and considered to be the stable development release of the format