Other file naming conventions
How to name files is a topic that occupies (and worries), in one way or another, the people who manage repositories. On this page we collect documentation on standards and/or good practices from others that may enlighten us.
Messages on the digipres list
I reproduce here some messages (with links) on this topic from the digipres digital preservation list. You will see that the big names of the North American libraries take part.
From: Ann Marie Willer <amwillerala@yahoo.com>
Subject: [digipres] file naming conventions?
To: digipres@ala.org, padg@ala.org
Date: Wed, 09 Jul 2008 14:22:16 -0700 (PDT)
X-Mailer: YahooMailRC/1042.33 YahooMailWebService/0.7.199
Colleagues,
I am involved in discussions about file naming conventions for the products of digitization projects. Could you (1) recommend guidelines recently published or posted and/or (2) share what you do at your institution? If I've missed a previous discussion, please let me know, and I will consult the archives as well.
Thanks,
Ann Marie
Ann Marie Willer
Preservation Services Librarian
Massachusetts Institute of Technology
77 Massachusetts Ave. Building 14-0513
Cambridge, MA 02139
617-253-5692 phone
Send ALA business to: AMWillerALA@yahoo.com
From: Jessica Branco Colati <jessica@coalliance.org>
Subject: [digipres] RE: file naming conventions?
To: 'Ann Marie Willer' <amwillerala@yahoo.com>, digipres@ala.org, padg@ala.org
Date: Wed, 09 Jul 2008 15:34:55 -0600
X-Mailer: Microsoft Office Outlook 12.0
Hi Ann Marie,
Members of our consortium have used CDP's Imaging guidelines when looking at file-naming conventions, both historically and within the context of reviewing the recent release of a new version late last month:
http://www.bcr.org/publications/bcreview/2008/06/digital-imaging-best-practices-ver2.html
We try to accommodate *any* file-naming conventions in practice at our members' institution, but from a software perspective, we've had some difficulty with '.' (dots/periods) used in filenames other than to delineate the file extension, and have had better luck with '_' (underscores) when provided by our members (i.e. MS01.0001.00001.tif vs MS01_001_00001.tif, seems to be more code-friendly??)
Best,
Jessica
Jessica Branco Colati
Project Director
Alliance Digital Repository
Colorado Alliance of Research Libraries
3801 E. Florida Ave., Suite 515
Denver, CO 80210
t: (303) 759-3399 x113
f: (303) 759-3363
e: jessica@coalliance.org
w: http://www.coalliance.org
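As an aside to Jessica's message: the dots-versus-underscores preference she describes can be applied mechanically. The following is only a minimal sketch, assuming that the last dot in the name always introduces the file extension; the function name `dots_to_underscores` is hypothetical and does not come from the thread or from any guideline.

```python
def dots_to_underscores(filename: str) -> str:
    """Replace every dot except the one before the extension with an underscore.

    Assumption: the last dot in the name separates the extension.
    """
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        # No dot at all, so there is no extension and nothing to change.
        return filename
    return stem.replace(".", "_") + "." + extension


print(dots_to_underscores("MS01.0001.00001.tif"))
```

Names with compound extensions (for example `.tar.gz`) would need a different rule, so this should be read as an illustration and not as a general tool.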
From: Liz Madden <emad@loc.gov>
Subject: [digipres] Re: RE: file naming conventions?
To: digipres@ala.org, padg@ala.org, jessica@coalliance.org,
amwillerala@yahoo.com
Date: Thu, 10 Jul 2008 09:14:36 -0400
X-Mailer: Novell GroupWise Internet Agent 7.0.2 HP
In addition to avoiding the dot and other punctuation marks and
spaces, we've discovered it's useful to stay away from characters that
Windows or UNIX or other platforms use as special characters. Here's a
list of ones that may cause problems:
< > : " / | ? *
Avoiding these is especially important in the event that you need to
transfer your content elsewhere in the future (e.g., to another
repository/institution/system that uses a different platform), where
the file name could be misinterpreted by the new system because of the
characters in it.
--Liz
*******************************
Liz Madden
Digital Media Projects Coordinator
Office of Strategic Initiatives
Library of Congress
101 Independence Ave SE
Washington, DC 20540
emad@loc.gov
phone: 202-707-4578
*******************************
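Liz's list lends itself to an automatic check before files are accepted. A minimal sketch, assuming that only the eight characters she lists plus whitespace are to be flagged; the names `PROBLEM_CHARS` and `problem_characters` are hypothetical, and a real policy would probably cover more cases.

```python
# The characters listed in the message above; whitespace is checked separately.
PROBLEM_CHARS = set('<>:"/|?*')


def problem_characters(filename: str) -> list[str]:
    """Return the problem characters found in filename, in order of first appearance."""
    found = []
    for ch in filename:
        if (ch in PROBLEM_CHARS or ch.isspace()) and ch not in found:
            found.append(ch)
    return found


print(problem_characters("report: draft?.tif"))
print(problem_characters("MS01_0001_00001.tif"))
```

Returning the offending characters, and not just a yes/no answer, makes it easier to tell whoever supplied the files what has to be renamed.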
From: "Walls, David" <david.walls@yale.edu>
Subject: [digipres] File Naming Conventions?
To: "digipres@ala.org" <digipres@ala.org>
Date: Thu, 10 Jul 2008 10:26:24 -0400
Ann Marie
If there are guidelines around for file naming conventions, I haven't been able to find anything that offers more than the most basic suggestions. My advice is to not to try to make up a naming convention, but to use the bibliographic record identification number for the specific resource to be scanned that is found in the MARC record for the title in your OPAC. Most of the materials that we are digitally reformatting are cataloged in our OPAC.
Call numbers can change, several books can have the same title, and using truncated titles for file names frequently don't offer much information. The bibliographic record number is unique, does not change, and we use this as the persistent identifier for the files. Also, data from OPACs already have a fairly reliable track record of being migrated into the future.
In our OPAC, the bibliographic record number is a six digit number. When we send materials to be scanned, we also send the vendor an Excel spreadsheet that includes the bibliographic record number, the title, and other information. The vendor returns the digital files of the materials scanned on a portable USB hard drive. The drive contains a series of folders all named by the six digit bibliographic id number. Inside each of the folders are the master, derivative, and metadata files.
For example, the parent folder would be named 123456 or whatever the actual number is. Inside the parent folder are four other folders named 123456.tif or 123456.jp2 depending on what we've chosen for the master file. The other folders are 123456.pdf and 123456.xml.
Please let me know if you have other questions.
David Walls
Preservation Librarian, Yale University Library.
Head, Reformatting and Media Preservation.
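The folder layout David describes can be written down as a small function, which helps when checking what a vendor returns. This is a sketch under stated assumptions: the message speaks of four subfolders but names only three (the master in `tif` or `jp2`, plus `pdf` and `xml`), so only those three are generated here; the function name `expected_folders` is hypothetical.

```python
from pathlib import PurePosixPath


def expected_folders(bib_id: str, master_format: str = "tif") -> list[PurePosixPath]:
    """List the subfolders expected for one six-digit bibliographic record number."""
    if not (bib_id.isascii() and bib_id.isdigit() and len(bib_id) == 6):
        raise ValueError(f"expected a six-digit record number, got {bib_id!r}")
    if master_format not in ("tif", "jp2"):
        raise ValueError(f"master format must be 'tif' or 'jp2', got {master_format!r}")
    parent = PurePosixPath(bib_id)
    # One subfolder per file type, each named <record number>.<type>.
    return [parent / f"{bib_id}.{suffix}" for suffix in (master_format, "pdf", "xml")]


for folder in expected_folders("123456"):
    print(folder)
```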
From: Robert Dowd <RDOWD@MAIL.NYSED.GOV>
Subject: [digipres] Re: File Naming Conventions?
To: "digipres@ala.org" <digipres@ala.org>
Date: Thu, 10 Jul 2008 10:43:21 -0400
X-Mailer: Novell GroupWise Internet Agent 7.0.2 HP
The New York State Library employs such a convention, with OCLC number or local control number being the major portion of most file names (not all items imaged have been cataloged).
Some of our larger items are scanned in parts, some imaging equipment saves raw scans at one file per page (later combined for a use copy), and some of our imaged titles are multi-part sets or serials. And so while the bib record identification number is a good start, we necessarily create file names that may include Volume, Number, Year, Month, Day, Part, Page, etc.
Previous cautions about use of 'underscore' and other standard characters all play into that.
Bob Dowd
Senior Librarian
Documents Section
New York State Library
Albany, NY 12230
From: Nancy <nmccrave@rochester.rr.com>
Subject: [digipres] RE: file naming conventions?
To: Ann Marie Willer <amwillerala@yahoo.com>, digipres@ala.org, padg@ala.org
Date: Thu, 10 Jul 2008 19:55:33 -0400
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
When I was researching file naming some time ago, I had bookmarked these pages (I found the first to be particularly helpful):
http://wiki.dlib.indiana.edu/confluence/display/INF/Filename+Requirements+for+Digital+Objects
http://www.archives.gov/preservation/technical/guidelines.pdf (see page 60)
http://www.controlledvocabulary.com/imagedatabases/filename_limits.html
http://edocs.lib.sfu.ca/projects/Doukhobor-Collection/technical.html
http://staffweb.library.northwestern.edu/dl/adhocdigitization/storage/
Hope this helps.
Nancy McCrave
From: "Casey, Michael T" <micasey@indiana.edu>
Subject: [digipres] RE: RE: file naming conventions?
To: "digipres@ala.org" <digipres@ala.org>, "padg@ala.org" <padg@ala.org>
Date: Fri, 11 Jul 2008 09:14:55 -0400
The Archives of Traditional Music updated its file naming scheme in 2006, working with our Digital Library Program which was simultaneously developing the recommendations presented by the first link in Nancy's message, below. You can see our implementation for audio files in chapter 3 of the publication Sound Directions: Best Practices for Audio Preservation, available at
http://www.dlib.indiana.edu/projects/sounddirections/papersPresent/index.shtml
Mike Casey
--
Mike Casey
Associate Director for Recording Services
Archives of Traditional Music
Indiana University
(812)855-8090
Co-Chair, ARSC Technical Committee
From: Liz Bishoff <lbishoff@BCR.ORG>
Subject: [digipres] RE: RE: RE: file naming conventions?
To: digipres@ala.org, padg@ala.org
Date: Fri, 11 Jul 2008 10:38:26 -0600
The BCR-CDP Digital Imaging Best Practices version 2.0 has just been published and it also has a section on naming conventions. So I think there is now plenty of options for those interested. You can find it at
http://www.bcr.org/cdp/best/index.html
Liz Bishoff, Director, Digital and Preservation Services
BCR
14394 E. Evans
Aurora CO 80014
lbishoff@bcr.org
From: Ingrid Mason <Ingrid.Mason@vuw.ac.nz>
Subject: [digipres] RE: file naming conventions?
To: Ann Marie Willer <amwillerala@yahoo.com>, digipres@ala.org, padg@ala.org
Date: Tue, 15 Jul 2008 14:40:44 +1200
Hi Ann,
I don't see anyone mentioning this, so I figure I may as well offer this as a side issue. I understand the need to define how files are to be named, particularly in such ways that won't create system issues (characters and spaces).
But, having 'intelligence' built into filenames, by using naming or system based alpha-numeric arrangement strikes me as slightly worrying. Why? Simply because I'm hoping that there is metadata associated with the digital file that enables it to be identified and retrieved; not using the 'information' in the filename in a meaningful way.
We have purposefully used 'dumb' or 'generic' filenames in loading digital material into the research repository, e.g. thesis.pdf; paper.pdf; form.pdf; report.pdf, etc. I expect the metadata that the object is associated with to enable information and object retrieval.
However, in saying that, the filenames we use are also a means to guide/remind users of the type of file that they are downloading. That in itself is 'doubling' up the load on the filename to act also as a resource type label. However, if all the filenames change in a preservation migration or transformation, we have metadata associated with digital object to identify the resource type.
I hope this thought/reminder is useful.
Cheers,
Ingrid
Ingrid Mason
Digital Research Repository Coordinator
ResearchArchive@Victoria
Victoria University of Wellington
ph: 64-4-463 6844
em: ingrid.mason@vuw.ac.nz
Location: Kelburn Campus, Rankine Brown, RB501A
From: Bruce Gordon <bgordon@fas.harvard.edu>
Subject: [digipres] Re: RE: file naming conventions?
To: Ann Marie Willer <amwillerala@yahoo.com>
Cc: digipres@ala.org, padg@ala.org
Date: Mon, 14 Jul 2008 23:19:47 -0400
X-Mailer: Apple Mail (2.926)
Hello Ann and Ingrid,
Digital preservation is not dependent upon the file name being anything but unique. Therefore a simple number string will suffice as long as metadata is linked to the file.
That said, there is a lot of value in having human readable names that convey information about the file such as catalog number, role, sequence number, etc. These things make the actual preservation workflow easier to follow and de-bug in case of problems.
In our workflow we use filenames that incorporate the call number, volume number, preservation role, face number, and file sequence number. Upon ingestion into the digital repository, this human-readable name is stored in metadata, and the file is named by the repository automatically with a unique number string which is more efficient for data processing. Upon retrieval from the digital repository, the human readable name may be restored from the metadata so that humans can work with the file without confusion.
Consistency and uniqueness are most important, regardless of the method used.
Best,
-Bruce
Bruce J. Gordon
Audio Engineer
Eda Kuhn Loeb Music Library
Harvard University
Cambridge, Massachusetts 02138
U.S.A
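The round trip Bruce describes (human-readable name on the way in, opaque number inside the repository, human-readable name restored on the way out) can be illustrated with a toy in-memory model. Everything here is an assumption made for illustration: the class `ToyRepository`, the twelve-digit counter and the sample file name are hypothetical and say nothing about how Harvard's repository actually works.

```python
import itertools


class ToyRepository:
    """Toy model: files get an opaque number, the readable name lives in metadata."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self._metadata: dict[str, str] = {}  # repository name -> human-readable name

    def ingest(self, human_name: str) -> str:
        """Store the human-readable name in metadata and return the assigned number string."""
        if human_name in self._metadata.values():
            raise ValueError(f"name already ingested: {human_name!r}")
        repository_name = f"{next(self._counter):012d}"
        self._metadata[repository_name] = human_name
        return repository_name

    def original_name(self, repository_name: str) -> str:
        """Restore the human-readable name from the metadata."""
        return self._metadata[repository_name]


repo = ToyRepository()
# Hypothetical name: call number, volume, role, face and sequence number.
assigned = repo.ingest("Mus780_v02_pres_f01_0003.wav")
print(assigned, "->", repo.original_name(assigned))
```

The point of the sketch is only that the two naming schemes do not compete: uniqueness is guaranteed by the repository, while the readable name survives as metadata.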
From: Trudy Levy <Trudy@dig-mar.com>
Subject: [digipres] Re: RE: file naming conventions?
To: digipres@ala.org
Date: Sat, 19 Jul 2008 13:38:07 -0700
I am glad that management systems have reached a state of development where links to objects never become broken. Harking from an earlier time when this did occur, I always encourage my clients to develop a alphanumerical code, such as Bruce is describing, which gives them some hint of the original object's identity.
In thinking of joining a larger collection down the line, I also encourage that they identify location/ownership of original object. In the California Local Historical Digital Resource Project, which is residjng with the CDL, we are using the codes derived the OCLC ID codes to identify each library.
For this project, we are embedding metadata in the TIFF header some descriptive metadata - title, owner, scanning vendor, ICC profile - for identifying purposes.
Yours
Trudy
--
Trudy Levy
Digital Transition Consultant
Image Integration
415 750 1274
http://www.DIG-Mar.com
Images are information - Manage them
Other standards or documents
An intriguing proposal is Pairtrees for Object Storage (http://www.cdlib.org/inside/diglib/pairtree/pairtreespec.html), found on the California Digital Library website. It seems very well thought out, but I do not fully understand how they achieve the advantages they claim for it.
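To make the Pairtree idea more concrete, here is a minimal sketch of the identifier-to-path mapping as I read the specification; the details (which characters are hex-encoded with `^`, and the three single-character substitutions) are my understanding and should be checked against the spec itself. The function names `clean_identifier` and `pairtree_path` are hypothetical.

```python
# Assumption (my reading of the spec): these characters, plus anything outside
# visible ASCII, are written as ^hh, where hh is the byte value in hexadecimal.
HEX_ENCODED = set('"*+,<=>?\\^|')
# Assumption (my reading of the spec): single-character substitutions applied afterwards.
SUBSTITUTIONS = {"/": "=", ":": "+", ".": ","}


def clean_identifier(identifier: str) -> str:
    """Turn an identifier into a string that is safe to use in directory names."""
    cleaned = []
    for byte in identifier.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x21 or byte > 0x7E or ch in HEX_ENCODED:
            cleaned.append(f"^{byte:02x}")
        else:
            cleaned.append(SUBSTITUTIONS.get(ch, ch))
    return "".join(cleaned)


def pairtree_path(identifier: str) -> str:
    """Split the cleaned identifier into two-character directory names."""
    cleaned = clean_identifier(identifier)
    pairs = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return "/".join(pairs) + "/"


print(pairtree_path("ark:/13030/xt12t3"))
```

Seen this way, the path is computed from the identifier alone, so no lookup table is needed to find an object, and no single directory has to hold a very large number of entries.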
On the website of HathiTrust (http://www.hathitrust.org/), a cooperative digital repository of the major North American universities, there is a link to the University of Michigan Digitization Specifications (http://www.lib.umich.edu/lit/dlps/dcsUMichDigitizationSpecifications20070501.pdf). The requirements for directory and file names start on page 7.
Updated by Ferran Jorba almost 14 years ago · 2 revisions