In the table below, MIME type is the Multipurpose Internet Mail Extensions (MIME) type identifier; for more information on MIME, see the MIME RFCs or the MIME FAQ. Description is the name most people use for the format. Extensions are the typical file name extensions (the part after the last dot; for example, the extension of "index.html" is "html"). Extensions are not case-sensitive in DSpace, so both "sample.XML" and "sample.xml" are recognized as XML. Level is DSpace's support level for each format: supported, known, or unsupported.
Please see the full format policy below for a complete explanation of these terms.
MIME type | Description | Extensions | Level |
---|---|---|---|
application/octet-stream | Unknown | | unsupported |
application/pdf | Adobe PDF | | supported |
text/xml | XML | xml | supported |
text/plain | Text | txt, asc | supported |
text/html | HTML | htm, html | supported |
application/msword | Microsoft Word | doc | known |
application/vnd.ms-powerpoint | Microsoft PowerPoint | ppt | known |
application/vnd.ms-excel | Microsoft Excel | xls | known |
application/marc | MARC | | supported |
image/jpeg | JPEG | jpeg, jpg | supported |
image/gif | GIF | gif | supported |
image/png | PNG | png | supported |
image/tiff | TIFF | tiff, tif | supported |
audio/x-aiff | AIFF | aiff, aif, aifc | supported |
audio/basic | audio/basic | au, snd | known |
audio/x-wav | WAV | wav | known |
video/mpeg | MPEG | mpeg, mpg, mpe | known |
text/richtext | RTF | rtf | supported |
application/vnd.visio | Microsoft Visio | vsd | known |
application/x-filemaker | FMP3 | fm | known |
image/x-ms-bmp | BMP | bmp | known |
application/x-photoshop | Photoshop | psd, pdd | known |
application/postscript | PostScript | ps, eps, ai | supported |
video/quicktime | QuickTime Video | mov, qt | known |
audio/x-mpeg | MPEG Audio | mpa, abs, mpega | known |
application/vnd.ms-project | Microsoft Project | mpp, mpx, mpd | known |
application/mathematica | Mathematica | ma | known |
application/x-latex | LaTeX | latex | known |
application/x-tex | TeX | tex | known |
application/x-dvi | TeX dvi | dvi | known |
application/sgml | SGML | sgm, sgml | known |
application/wordperfect5.1 | WordPerfect | wpd | known |
audio/x-pn-realaudio | RealAudio | ra, ram | known |
image/x-photo-cd | Photo CD | pcd | known |
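
As a rough illustration of how the table might be applied, the sketch below looks up a file's MIME type and support level from its extension, ignoring case, and falls back to "application/octet-stream" when the extension is not listed. The `FORMATS` dictionary and the `identify_format` function are hypothetical names chosen for this example, not part of DSpace, and only a few rows of the table are included.

```python
import os

# Hypothetical excerpt of the table above: extension -> (MIME type, level).
FORMATS = {
    "xml": ("text/xml", "supported"),
    "txt": ("text/plain", "supported"),
    "asc": ("text/plain", "supported"),
    "htm": ("text/html", "supported"),
    "html": ("text/html", "supported"),
    "doc": ("application/msword", "known"),
    "jpeg": ("image/jpeg", "supported"),
    "jpg": ("image/jpeg", "supported"),
}

# Fallback for anything that cannot be identified (the "Unknown" row).
UNKNOWN = ("application/octet-stream", "unsupported")


def identify_format(filename):
    """Return (MIME type, level) for a file name, ignoring extension case."""
    _, ext = os.path.splitext(filename)
    # splitext keeps the leading dot, so strip it before the lookup.
    return FORMATS.get(ext.lstrip(".").lower(), UNKNOWN)


if __name__ == "__main__":
    for name in ("sample.XML", "sample.xml", "thesis.doc", "data.xyz"):
        print(name, identify_format(name))
```

Under these assumptions, "sample.XML" and "sample.xml" both resolve to the XML row, while a file with an unlisted extension such as "data.xyz" is treated as unknown.
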
We want to provide support for as many file formats as possible. Over time, items stored in DSpace will be preserved as is, using a combination of time-honored techniques for data management and best practices for digital preservation. As for specific formats, however, the proprietary nature of many file types makes it impossible to guarantee much more than this. Put simply, our policy for file formats is:
By "support", we mean "make usable in the future, using whatever combination of techniques (such as migration, emulation, etc.) is appropriate given the context of need". For supported formats, we might, for instance, choose to bulk-transform files from a current format version to a future one. But we can't predict which services will be necessary down the road, so we'll continually monitor formats and techniques to ensure we can accommodate needs as they arise.
In the meantime, we can choose to "support" a format if we can gather enough documentation to capture how the format works. In particular, we collect file specifications, descriptions, and code samples, and make those available in the DSpace Format Reference Collection. Unfortunately, this means that proprietary formats for which these materials are not publicly available cannot be supported in DSpace. We will still preserve these files, and in cases where those formats are native to tools supported by MIT Information Systems, we will provide you with guidance on converting your files into formats we do support. It is also likely that for extremely popular but proprietary formats (such as Microsoft .doc, .xls, and .ppt), we will be able to help make files in those formats more useful in the future simply because their prevalence makes it likely tools will be available. Even so, we cannot guarantee this level of service without also having more information about the formats, so we will still list these formats as "known", not "supported".
We understand that there are always more formats to consider, and we would appreciate your help in identifying and studying the suitability of support for formats you care about. If we can't identify a format, DSpace will record it as "unknown", aka "application/octet-stream", but we would like to keep the percentage of supported format materials in DSpace as high as possible. Don't hesitate to contact us if you have any questions or concerns.