Where to store all these files
See also: GestionarFitxersViaWeb and HistoricsIRepliquesAmbGit
We store the TIFF files on a SataBeast.
I recently asked a question on a mailing list about libraries and free software. I am attaching my question and the replies here so that everyone has access to them.
In any case, it has no easy answer.
Well done! This is something we urgently need to get started on. NuriaGallart
JoseManuelCastillo and I have been investigating different hardware and software options. On the following pages we will collect what we find:
- QuantsTerabytes
- LiniesGeneralsDeLaPreservacio
- Hardware
- Software
Subject: [oss4lib-discuss] Storing and keeping safe those huge digitalisation files
Date: Thu, 25 Jan 2007 16:33:30 +0100
From: Ferran Jorba <Ferran.Jorba@uab.es>
Organization: Universitat Autònoma de Barcelona
To: OSS4LIB (E-mail) <oss4lib-discuss@lists.sourceforge.net>
Hello,
I'm going to ask help for something I'd say it is a common situation nowadays mostly everywhere.
My university is engaged in digitalisation of old material, like most do. So do some of my neighbour universities. Our libraries belong to a local consortium, like most libraries do. This digitalized material means, among other things, lots of fat TIFF files, totaling a huge amount of Gigabytes. I think most of you know that.
There are plenty disk array vendors willing to sell you their solutions. I like specially Capricorn Tech (http://www.capricorn-tech.com/), due to their Archive.org pedigree, and Copan Systems (http://www.copansys.com/) for their MAID concept.
But one of our most urgent problems is keeping those original TIFF (and their corresponding PDFs) safe beyond just storing them somewhere: I mean having more than one copy, doing backups, verifying checksums, automatically fixing the damaged files, maybe changing formats, etc.
This second part is already invented, and it is called LOCKSS (http://www.lockss.org). It is a software with anything I could ask for, and more.
What I have been unable to find in the LOCKSS site is a configuration model where some libraries, in a local consortium, join together to keep jointly this material. Ok, I understand that LOCKSS is designed to keep the material of [external] publishers. But when I first learned about CLOCKSS (Controlled LOCKSS) I immediately thought that it would address the scenario we are facing in our consortium. However, I cannot find it in their web pages.
May I ask how are you addressing this scenario? If there is a better forum for this question, I'd gladly ask it there again.
Thanks,
Ferran
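As a rough illustration of the "verifying checksums" step mentioned in the question above, here is a minimal Python sketch. It is not taken from LOCKSS or from any vendor product; the function names (`sha256_of`, `build_manifest`, `verify`) and the choice of SHA-256 are assumptions made for the example. The idea is to record a digest for every file once, and later list the files that have gone missing or whose content has changed.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large TIFFs are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return the relative paths that are missing or whose digest has changed."""
    damaged = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(rel)
    return damaged
```

A check like this only detects damage. Repairing a damaged file still requires a second, intact copy, which is why the question asks for more than one copy in the first place.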
Subject: Re: [oss4lib-discuss] Storing and keeping safe those huge digitalisation files
Date: Thu, 25 Jan 2007 10:21:04 -0600
From: Beth Nicol <nicollb@auburn.edu>
To: Ferran Jorba <Ferran.Jorba@uab.es>
References: <45B8CDCA.2020406@uab.es>
Ferran:
I do believe that CLOCKSS or a private LOCKSS network is what you are looking for. If you contact the LOCKSS folks, I think they can help you out.
I work with 2 projects that are doing what you seem to want to do: the MetaArchive of Southern Digital Culture (which is a part of the Library of Congress's National Digital Information and Infrastructure Preservation Project, aka NDIIPP) and another project in Alabama which is just starting up.
The MetaArchive group is offering a workshop on setting up these types of networks. Information is online at http://www.metascholar.org/events/2007/ddp/
Essentially, these projects (and others) are using private LOCKSS networks to create a dark archive with the caches somewhat geographically dispersed.
Does this help?
Beth Nicol <nicollb@auburn.edu>
Information Technology Specialist
Auburn University Libraries
(334)844-1731
Subject: Re: [oss4lib-discuss] Storing and keeping safe those huge digitalisation files
Date: Thu, 25 Jan 2007 11:33:13 -0500 (EST)
From: Joe Hourcle <oneiros@grace.nascom.nasa.gov>
To: Ferran Jorba <Ferran.Jorba@uab.es>
CC: OSS4LIB (E-mail) <oss4lib-discuss@lists.sourceforge.net>
References: <45B8CDCA.2020406@uab.es>
On Thu, 25 Jan 2007, Ferran Jorba wrote:
> Hello,
>
> I'm going to ask help for something I'd say it is a common situation
> nowadays mostly everywhere.
>
> My university is engaged in digitalisation of old material, like most
> do. So do some of my neighbour universities. Our libraries belong to
> a local consortium, like most libraries do. This digitalized material
> means, among other things, lots of fat TIFF files, totaling a huge
Could you give us an idea of how big the archive is, and how fast it's
expected to grow?
For smaller repositories, we're currently using Apple XServe RAID. For
stuff that's multiple terabytes, we're using hardware from Pillar Data
Systems <http://www.pillardata.com/>. I know for our last big purchase,
we had looked at Network Appliance <http://www.netapp.com/>, but I wasn't
involved in the purchase decision, so I don't know what the determining
factors were (performance, features, support, price, etc.), or what other
vendors were evaluated.
[trimmed]
> What I have been unable to find in the LOCKSS site is a configuration
> model where some libraries, in a local consortium, join together to
> keep jointly this material. Ok, I understand that LOCKSS is designed
> to keep the material of [external] publishers. But when I first
> learned about CLOCKSS (Controlled LOCKSS) I immediately thought that
> it would address the scenario we are facing in our consortium. However,
> I cannot find it in their web pages.
http://www.lockss.org/clockss/Home
(see the left-nav)
http://www.lockss.org/clockss/FAQ
Where does the initiative currently stand?
The initiative, which began early in 2006, is implementing and
evaluating both social and technical models over a two-year
period. During this time the initiative will work to build a
full-scale production system using a significant portion of the
content of the publisher members. The work of the initiative is
transparent and will be independently assessed, with all findings
reported to the wider community.
Joe Hourcle
Subject: Re: [oss4lib-discuss] Storing and keeping safe those huge digitalisation files
Date: Sat, 27 Jan 2007 15:06:22 +0930
From: Stephen De Gabrielle <spdegabrielle@gmail.com>
To: OSS4LIB (E-mail) <oss4lib-discuss@lists.sourceforge.net>
CC: Ferran Jorba <Ferran.Jorba@uab.es>
References: <45B8CDCA.2020406@uab.es> <26C6B0CCB6892843849BE72624C9D12E1745C7@medusa.library.arizona.edu>
Hi
I just thought I'd point out some useful resources:
This message talks about making LOCKSS work with digital repository software to create a private LOCKSS network.
http://www.sfu.ca/~hgmorris/openaccess-archiving/msg00006.html
Hopefully this helps, as the repository software is getting pretty good at supporting the digital preservation work you mention in your original email.
Also the China Digital Museum Project paper 'Building a Distributed, Standards-based Repository Federation' talks a fair bit about how they handled replication and naming of metadata and content using the DSpace Digital Repository software.
-- http://www.dlib.org/dlib/july06/tansley/07tansley.html
The RLG has some nice work that you may find interesting:
Attributes of Trusted Digital Repositories - http://www.rlg.org/en/page.php?Page_ID=583
Audit Checklist for Certifying Digital Repositories (Draft - but good enough for others[MAGDIR] to consider it as a model) -- http://www.rlg.org/en/page.php?Page_ID=20769
Cheers,
Stephen De Gabrielle
Subject: Re: [oss4lib-discuss] oss4lib-discuss Digest, Vol 8, Issue 6
Date: Thu, 25 Jan 2007 15:54:51 -0500
From: Bosman, Don <dbosman@mail.lib.msu.edu>
To: oss4lib-discuss@lists.sourceforge.net
I trimmed a lot out for brevity.
For archiving, and for day to day "homes" folder usage, Libraries, Michigan State University is using Apple's XServe RAID SAN solutions using Fibre Channel connections to our main servers. We currently have two XServe setups. One large mirrored set in our main library and an off site (in a branch) unit. We mirror every night and backup to the off site unit on the weekend.
Getting started was a bit rocky as we were one of the first institutional installations for Apple and they weren't quite finished with the software. The last couple of terabyte expansions were relatively painless. I think we are at ten to twelve terabytes at this time.
I must add that we do not do "live" manipulations on the SAN system. Using Photoshop to crop, rotate, tweak the color, etc, on all the pages in a scanned journal or newspaper can slow the building network. Files are scanned or Bookeye'd to local drives - manipulated as needed for archiving then batch copied to the SAN in the evening. This type of buffering has worked for us.
I don't know what the future will bring or need, but we are happy with the system we have in place at this time.
Don Bosman
Information Technologist
Libraries, Michigan State University
100 Library
East Lansing, MI 48824-1048
dbosman@mail.lib.msu.edu
(517) 432-6123 ext 233
Fax (517) 432-8374
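The buffering workflow described in the message above (edit on local drives, then batch copy to shared storage in the evening) could be sketched roughly as follows. This is only an illustration, not MSU's actual procedure: the names `file_digest` and `batch_copy` and the `staging` and `archive` directories are hypothetical, and the digest comparison after each copy is an extra safeguard assumed for the example.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def batch_copy(staging: Path, archive: Path) -> List[str]:
    """Copy every file under staging into archive, keeping the folder layout.

    Returns the relative paths whose copy did not match the source digest.
    """
    failed = []
    for source in sorted(staging.rglob("*")):
        if not source.is_file():
            continue
        rel = source.relative_to(staging)
        target = archive / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if file_digest(source) != file_digest(target):
            failed.append(rel.as_posix())
    return failed
```

Local files would only be deleted once `batch_copy` returns an empty list, so a failed transfer never leaves the archive as the sole, damaged copy.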
Subject: Re: [oss4lib-discuss] Storing and keeping safe those huge digitalisation files
Date: Fri, 26 Jan 2007 15:55:25 +0100
From: Ferran Jorba <Ferran.Jorba@uab.es>
Organization: Universitat Autònoma de Barcelona
CC: OSS4LIB (E-mail) <oss4lib-discuss@lists.sourceforge.net>
References: <45B8CDCA.2020406@uab.es> <45B884A2.9A00.00D8.0@auburn.edu>
Thank you all for your responses.
[...]
> Essentially, these projects (and others) are using private LOCKSS
> networks to create a dark archive with the caches somewhat
> geographically dispersed.
Beth's reply about Auburn participating in a private LOCKSS network has given me hope. I've followed your suggestion and contacted the LOCKSS people at http://www.lockss.org/clockss/Talkback
Answering some of your other questions, I still don't know the size, because I have (partial) information about my own library, but less from the others. Several Terabytes for sure, but again, I know that this is too vague.
> Does this help?
Sure it does.
Thanks again,
Ferran
Subject: Re: [oss4lib-discuss] Storing and keeping safe those huge digitalisation files
Date: Fri, 26 Jan 2007 10:38:27 -0600
From: Beth Nicol <nicollb@auburn.edu>
To: Ferran Jorba <Ferran.Jorba@uab.es>
References: <45B8CDCA.2020406@uab.es> <45B884A2.9A00.00D8.0@auburn.edu> <45BA165D.5030704@uab.es>
Ferran:
This is off-list, but, you can contact troberts@stanford.edu directly about the Private LOCKSS networks. I talked with him yesterday, and he can either answer your questions or get you hooked up with the right folks. You can tell him I referred you.
Beth Nicol <nicollb@auburn.edu>
Information Technology Specialist
Auburn University Libraries
(334)844-1731
Subject: Re: [oss4lib-discuss] Storing and keeping safe those huge digitalisation files
Date: Fri, 26 Jan 2007 14:09:53 -0700
From: Han, Yan <hany@u.library.arizona.edu>
To: Ferran Jorba <Ferran.Jorba@uab.es>, "OSS4LIB (E-mail)" <oss4lib-discuss@lists.sourceforge.net>
References: <45B8CDCA.2020406@uab.es>
There is no easy answer to your questions. My understanding of LOCKSS is that it does not work for straight TIFF/PDFs.
There are organizations who can take care of your problems. OCLC is testing the idea of preservation. There is also a research project going on with the NDIIPP project, which has a consortium to do digital preservation (Emory U. is one of the partners).
Or you can just buy some hard drives/tapes and save multiple copies in off-site storage. But in this case, you are responsible for the migration of formats etc.
I like the idea of consortium preservation, but there are other issues to be sorted out.
Yan Han
The University of Arizona Libraries
Subject: Re: [oss4lib-discuss] Storing and keeping safe those huge digitalisation files
Date: Mon, 29 Jan 2007 08:30:40 -0600
From: Beth Nicol <nicollb@auburn.edu>
To: OSS4LIB (E-mail) <oss4lib-discuss@lists.sourceforge.net>, Yan Han <hany@u.library.arizona.edu>, Ferran Jorba <Ferran.Jorba@uab.es>
References: <45B8CDCA.2020406@uab.es> <26C6B0CCB6892843849BE72624C9D12E1745C7@medusa.library.arizona.edu>
I'm not sure what you mean by "it does not work for straight TIFF/PDFs" -- you must organize your files into Archival Units, and create a manifest page just as you would for a journal. I've harvested several GB of TIFFs using LOCKSS.
Beth Nicol <nicollb@auburn.edu>
Information Technology Specialist
Auburn University Libraries
(334)844-1731
Subject: Re: [oss4lib-discuss] Storing and keeping safe those huge digitalisation files
Date: Mon, 29 Jan 2007 11:15:16 -0700
From: Han, Yan <hany@u.library.arizona.edu>
To: Beth Nicol <nicollb@auburn.edu>, "OSS4LIB (E-mail)" <oss4lib-discuss@lists.sourceforge.net>, Ferran Jorba <Ferran.Jorba@uab.es>
References: <45B8CDCA.2020406@uab.es> <26C6B0CCB6892843849BE72624C9D12E1745C7@medusa.library.arizona.edu> <45BDB0B6.9A00.00D8.0@auburn.edu>
Beth,
Thanks for pointing that out. The algorithm in LOCKSS is to use peers in the network to preserve/restore/repair files (by voting with the majority). My question is: if I have an MD5/SHA signature, I know whether the file is authentic. Why do I need a vote?
As your library is a member of MetaArchive, could you explain more about how you handle the digital signature? Do you make any modifications to the LOCKSS source code? What about the cost? (In this case, I assume that you are using a PC as a storage unit, so the cost should be lower.)
Yan
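To make the "voting with the majority" idea in the message above concrete, here is a small Python sketch. It is a toy model, not the actual LOCKSS polling protocol: the function names and the peer labels used in the usage lines are hypothetical. One commonly given reason for comparing copies across peers rather than against a stored checksum is that a stored checksum can itself be lost or corrupted, whereas a vote needs no single trusted reference.

```python
from collections import Counter
from typing import Dict, List, Optional


def majority_digest(digests: Dict[str, str]) -> Optional[str]:
    """Return the digest reported by a strict majority of peers, or None."""
    if not digests:
        return None
    digest, votes = Counter(digests.values()).most_common(1)[0]
    return digest if votes * 2 > len(digests) else None


def peers_needing_repair(digests: Dict[str, str]) -> List[str]:
    """List peers whose copy disagrees with the majority.

    Returns an empty list when no strict majority exists, because in that
    case the vote cannot say which copy is the good one.
    """
    winner = majority_digest(digests)
    if winner is None:
        return []
    return sorted(peer for peer, d in digests.items() if d != winner)


# Usage with three hypothetical peers, one of which holds a damaged copy.
reported = {"peer-a": "9f2c", "peer-b": "9f2c", "peer-c": "0000"}
damaged_peers = peers_needing_repair(reported)
```

In this model the damaged peer would then fetch the file again from any peer that voted with the majority.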
Subject: Universitat Autònoma de Barcelona
Date: Fri, 26 Jan 2007 10:07:38 -0800
From: Victoria Reich <vreich@stanford.edu>
Reply-To: vreich@stanford.edu
To: Ferran.Jorba@uab.es, Victoria Reich <vreich@stanford.edu>
Dear Ferran,
We are very pleased that you are interested in the LOCKSS and CLOCKSS Programs.
For your application, you will want to use the LOCKSS software and you will wish to set up a Private LOCKSS Network. The LOCKSS system can be used to preserve many TB of web based content that the library holds. If you cooperate with other libraries in Spain -- you can inexpensively build a very robust, distributed preservation network. This is not hard to do, we support libraries who join the LOCKSS Alliance to do this.
Before going further, I strongly suggest you bring a LOCKSS box online. This first hand experience is the best and easiest way to learn how LOCKSS works. The instructions for installing a LOCKSS box are here. http://www.lockss.org/lockss/Installing_LOCKSS
To bring a LOCKSS box online is free and you are welcome to send us email if you have questions.
Sincerely,
Victoria Reich
Director LOCKSS Program
Stanford University Libraries
011.650.725.1134
www.lockss.org
Libraries are using LOCKSS to build local libraries! www.lockss.org
CLOCKSS, a collaborative community archive www.lockss.org/clockss
----
It is not clear to me, from reading your site, that it can work in this scenario: I work for a University Library where we have to store for the long term digital material where, most of the time, there is no publisher involved: either because they are personal archives, or old periodicals with no live editor, etc. Other universities around ours have similar projects. Reading your CLOCKSS pages, I see references to those editors that distract me. What I'm currently seeking is advice whether CLOCKSS can be used to do (mostly) unattended backups, recoveries, etc. for these large archives (several Terabytes, not determined yet).
I have been directed here by people involved in http://www.metascholar.org/events/2007/ddp/ .
Subject: Re: Universitat Autònoma de Barcelona
Date: Mon, 29 Jan 2007 10:17:35 +0100
From: Ferran Jorba <Ferran.Jorba@uab.es>
Organization: Universitat Autònoma de Barcelona
To: vreich@stanford.edu
References: <45BA436A.6040909@stanford.edu>
Hello Victoria,
thank you for your fast response. I'll follow your suggestion and I'll try to bring up a LOCKSS box myself, and see what I learn.
How do you suggest I proceed if I have more detailed questions? I googled for a CLOCKSS mailing list and I didn't find it; not even a LOCKSS mailing list. May I ask you for a suitable forum, or a contact person?
Thanks again,
Ferran
Updated by Ferran Jorba almost 14 years ago · 1 revision