Audio metadata extraction
The .info files store technical metadata about each file: its md5 and sha1 checksums, plus the metadata extracted by two tools, a format-specific one and Jhove. For images the format-specific tool is ImageMagick, and for PDFs it is xpdf.
For audio files (and probably video too) we still have to choose a format-specific tool, all the more so because Jhove cannot extract anything useful from mp3 files, for example.
Of the options I evaluated, hachoir-metadata seems preferable, because it gives the clearest and most detailed information. In any case, this page records the different options I evaluated and the output each one produces.
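As a rough illustration of the checksum part of a .info file, the following Python sketch computes the md5 and sha1 digests of a file in a single pass. The function name `file_digests` and the chunk size are assumptions made for this example; the script that actually writes the .info files is not shown on this page.

```python
import hashlib


def file_digests(path, chunk_size=131072):
    """Return (md5_hex, sha1_hex) for the file at `path`.

    The file is read in chunks so that large audio files do not
    have to fit in memory. `chunk_size` is an arbitrary choice.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()
```

Calling `file_digests("bbcsound_v29n4.mp3")` would return the two hex strings to store alongside the extracted metadata.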
Jhove
$ jhove --help
Jhove (Rel. 1.4, 2009-07-30)
Date: 2011-04-28 12:49:46 CEST
App:
API: 1.2, 2007-05-10
Configuration: /etc/jhove/jhove.conf
JhoveHome: /users/stephen/projects/jhove
Encoding: utf-8
TempDirectory: /var/tmp
BufferSize: 131072
Module: AIFF-hul 1.3
Module: ASCII-hul 1.2
Module: BYTESTREAM 1.3
Module: GIF-hul 1.3
Module: HTML-hul 1.2
Module: JPEG-hul 1.2
Module: JPEG2000-hul 1.3
Module: PDF-hul 1.8
Module: TIFF-hul 1.5
Module: UTF8-hul 1.3
Module: WAVE-hul 1.3
Module: XML-hul 1.3
OutputHandler: Audit 1.1
OutputHandler: TEXT 1.4
OutputHandler: XML 1.5
Usage: java Jhove [-c config] [-m module] [-h handler] [-e encoding] [-H handler] [-o output]
[-x saxclass] [-t tempdir] [-b bufsize] [-l loglevel] [[-krs] dir-file-or-uri [...]]
Rights: Copyright 2004-2009 by the President and Fellows of Harvard College.
Released under the GNU Lesser General Public License.
$ jhove -k bbcsound_v29n4.mp3
Jhove (Rel. 1.4, 2009-07-30)
Date: 2011-04-28 12:38:47 CEST
RepresentationInformation: bbcsound_v29n4.mp3
ReportingModule: BYTESTREAM, Rel. 1.3 (2007-04-10)
LastModified: 2003-03-17 17:11:15 CET
Size: 3048973
Format: bytestream
Status: Well-Formed and valid
MIMEtype: application/octet-stream
Checksum: fc6d7074
Type: CRC32
Checksum: 90eb77e67154df47b8ccf36a2a0afb35
Type: MD5
Checksum: 9b0bfbbbebd58213e6baf8e8e15ae530b1958d1f
Type: SHA-1
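The report above comes from the generic BYTESTREAM module, which is why it carries no audio-specific information. A minimal sketch of how a script might detect that case from Jhove's text output follows; the function names are hypothetical and the parsing assumes one `Key: value` pair per line, as in the TEXT output handler.

```python
def jhove_reporting_module(report):
    """Return the module name from a Jhove text report, or None if absent."""
    for line in report.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "ReportingModule":
            # The value looks like "BYTESTREAM, Rel. 1.3 (2007-04-10)".
            return value.split(",")[0].strip()
    return None


def is_generic_report(report):
    """True when Jhove only treated the file as an opaque byte stream."""
    return jhove_reporting_module(report) == "BYTESTREAM"
```

A caller could use `is_generic_report` to decide that the Jhove section of the .info file adds nothing beyond checksums for that file.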
hachoir-metadata
$ hachoir-metadata --help
Usage: hachoir-metadata [options] files
Options:
-h, --help show this help message and exit
Metadata:
Option of metadata extraction and display
--type Only display file type (description)
--mime Only display MIME type
--level=LEVEL Quantity of information to display from 1 to 9 (9 is
the maximum)
--raw Raw output
--bench Run benchmark
--parser-list List all parsers then exit
--profiler Run profiler
--version Display version and exit
--quality=QUALITY Information quality (0.0=fastest, 1.0=best, and default
is 0.5)
Hachoir library:
Configure Hachoir library
--verbose Verbose mode
--log=LOG Write log in a file
--quiet Quiet mode (don't display warning)
--debug Debug mode
$ hachoir-metadata bbcsound_v29n4.mp3
Metadata:
- Title: Morocco, Cafe, Rabat, Audible Traffic, TV And Expresso Machine
- Author: BBC
- Album: BBC Sound Effects 29 - Africa- The Human World
- Duration: 4 min 13 sec 387 ms
- Music genre: Efectes de so
- Track number: 4
- Channel: Joint stereo
- Sample rate: 44.1 kHz
- Bits/sample: 16 bits
- Compression rate: 14.7x
- Bit rate: 96.0 Kbit/sec (constant)
- Format version: MPEG version 1 layer III
- MIME type: audio/mpeg
- Endian: Big endian
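Since hachoir-metadata prints one `- Key: value` pair per line, its output is easy to turn into a structure that can be written to a .info file. The following is only a sketch: `parse_hachoir_metadata` is a hypothetical helper, and it assumes the default text output shown above, with one field per line.

```python
def parse_hachoir_metadata(text):
    """Turn '- Key: value' lines into a dict; other lines are ignored.

    If a key appears more than once, the last value wins.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("- "):
            continue  # skips the command line and the "Metadata:" header
        # Split on the first ": " only, so values may contain colons.
        key, sep, value = line[2:].partition(": ")
        if sep:
            fields[key] = value
    return fields
```

With the output above, `parse_hachoir_metadata(output)["Bit rate"]` would give `"96.0 Kbit/sec (constant)"`.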
extract
$ extract --help
Usage: extract [OPTIONS] [FILENAME]*
Extract metadata from files.
Arguments mandatory for long options are also mandatory for short options.
-a, --all do not remove any duplicates
-b, --bibtex print output in bibtex format
-B, --binary=LANG use the generic plaintext extractor for the
language with the 2-letter language code LANG
-d, --duplicates remove duplicates only if types match
-f, --filename use the filename as a keyword (loads
filename-extractor plugin)
-g, --grep-friendly produce grep-friendly output (all results on one
line per file)
-h, --help print this help
-H, --hash=ALGORITHM compute hash using the given ALGORITHM (currently
sha1 or md5)
-l, --library=LIBRARY load an extractor plugin named LIBRARY
-L, --list list all keyword types
-n, --nodefault do not use the default set of extractor plugins
-p, --print=TYPE print only keywords of the given TYPE (use -L to
get a list)
-r, --remove-duplicates remove duplicates even if keyword types do not
match
-s, --split use keyword splitting (loads split-extractor
plugin)
-v, --version print the version number
-V, --verbose be verbose
-x, --exclude=TYPE do not print keywords of the given TYPE
$ extract -f bbcsound_v29n4.mp3
duration - 4m14
format - MPEG-1 Layer III audio, 96 kbps (CBR), 44100 Hz, joint stereo, no copyright, original
resource-type - MPEG-1
mimetype - audio/mpeg
description - BBC: Morocco, Cafe, Rabat, Audible (BBC Sound Effects 29 - Africa-)
track number - 4
album - BBC Sound Effects 29 - Africa-
artist - BBC
title - Morocco, Cafe, Rabat, Audible
album - BBC Sound Effects 29 - Africa- The Human World
track number - 04
content type - Efectes de so
title - Morocco, Cafe, Rabat, Audible Traffic, TV And Expresso Machine
filesize - 3.05 MB
filename - bbcsound_v29n4.mp3
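Unlike hachoir-metadata, extract repeats some keywords (title, album and track number each appear twice above), so a parser has to keep every value rather than a single one per key. A minimal sketch, assuming the default `keyword - value` output with one pair per line; `parse_extract_output` is a hypothetical name:

```python
from collections import defaultdict


def parse_extract_output(text):
    """Group 'keyword - value' lines into {keyword: [values, ...]}."""
    keywords = defaultdict(list)
    for line in text.splitlines():
        # Split on the first " - " only; values such as album names
        # may themselves contain " - ".
        key, sep, value = line.partition(" - ")
        if sep:
            keywords[key.strip()].append(value.strip())
    return dict(keywords)
```

For the output above this yields two entries under `"title"`, a truncated one and the full one, which is part of what makes this tool's output less clear than hachoir-metadata's.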