utility script to prepare a file of MARCXML records for fingerprinter
author Galen Charlton <gmc@esilibrary.com>
Mon, 30 Jul 2012 17:42:23 +0000 (13:42 -0400)
committer Galen Charlton <gmc@esilibrary.com>
Mon, 30 Jul 2012 17:42:23 +0000 (13:42 -0400)
Given a two-column tab-delimited text file containing bib IDs and MARCXML, produces
a MARCXML file with the bib IDs in 903 fields.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>

munge_marc_export_for_fingerprint.pl [new file with mode: 0755]

diff --git a/munge_marc_export_for_fingerprint.pl b/munge_marc_export_for_fingerprint.pl
new file mode 100755 (executable)
index 0000000..e07c60d
--- /dev/null
@@ -0,0 +1,31 @@
+#!/usr/bin/perl
+
+# Copyright 2009-2012, Equinox Software, Inc.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version 2
+# of the License, or (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+
+# Utility script to prepare a file of MARCXML records extracted from an Evergreen
+# database for fingerprinter by adding 903 fields.  Usage:
+#  echo "select id || chr(9) || REGEXP_REPLACE(marc, E'\\n','','g') from biblio.record_entry where not deleted and id < $BIBIDSTART" > $BIN/incumbent_bibs.sql 
+#  psql -A -t -U $DBUSER < $BIN/incumbent_bibs.sql | munge_marc_export_for_fingerprint.pl > $INTER/incumbent.mrc
+
+while (<>) {
+    my ($id, $rest) = split /\t/, $_, 2;
+    $rest =~ s!<datafield [^>]*tag="903"[^>]*>.*?</datafield>!!g;
+    $rest =~ s!</record>!<datafield tag="903"><subfield code="a">$id</subfield></datafield></record>!;
+    $rest =~ s!</marc:record>!<datafield tag="903"><subfield code="a">$id</subfield></datafield></marc:record>!;
+    print $rest;
+}
+
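
To illustrate what the loop body does to a single input line, here is a minimal sketch, not part of the commit. The helper name add_903 and the sample record are hypothetical. The sketch assumes that 903 fields appear as unprefixed <datafield> elements, and it uses [^>]* so that the removal pattern cannot start matching inside an earlier datafield's opening tag.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the committed script): applies the same
# per-line transformation as the script's while loop and returns the result.
sub add_903 {
    my ($line) = @_;

    # Column 1 is the bib ID, column 2 is the MARCXML (with trailing newline).
    my ($id, $rest) = split /\t/, $line, 2;

    # Drop any existing 903 field; [^>]* confines the tag test to one opening tag.
    $rest =~ s!<datafield [^>]*tag="903"[^>]*>.*?</datafield>!!g;

    # Append a fresh 903$a carrying the bib ID just before the record closes.
    $rest =~ s!</record>!<datafield tag="903"><subfield code="a">$id</subfield></datafield></record>!;
    $rest =~ s!</marc:record>!<datafield tag="903"><subfield code="a">$id</subfield></datafield></marc:record>!;

    return $rest;
}

# Made-up sample line: bib ID 42, one 245 field, and a stale 903 field.
my $sample = qq{42\t<record>}
    . qq{<datafield tag="245" ind1=" " ind2=" "><subfield code="a">Title</subfield></datafield>}
    . qq{<datafield tag="903" ind1=" " ind2=" "><subfield code="a">old</subfield></datafield>}
    . qq{</record>\n};

print add_903($sample);
```

With this input the 245 field is kept, the stale 903 is dropped, and a new 903 with subfield a set to 42 is placed just before </record>.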