From 7a1a4928f4f4f7786f30100c58f2b942227a9a49 Mon Sep 17 00:00:00 2001
From: Galen Charlton
Date: Thu, 2 Aug 2012 11:28:42 -0400
Subject: [PATCH 1/1] trim excessive trailing whitespace from subfield contents

If a subfield has too much (arbitrarily defined as at least 10) of
trailing whitespace, trim the whitespace. This works around a problem
applying certain stylesheets (like the MARCXML-to-MODS stylesheet) that
use a recursive XSLT function to trim whitespace.

Note that only "excessive" whitespace is trimmed; some systems emit
subfields that contain semantically significant trailing whitespace in
certain fields.

Signed-off-by: Galen Charlton
---
 marc_cleanup |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/marc_cleanup b/marc_cleanup
index c26525c..e0c20a7 100755
--- a/marc_cleanup
+++ b/marc_cleanup
@@ -193,6 +193,12 @@ sub do_automated_cleanups {
         message("Dollar sign corrected");
     }
 
+    # excessive trailing whitespace in subfield contents
+    if ($record[$ptr] =~ m|\s{10,}|) {
+        $record[$ptr] =~ s|\s{10,}||;
+        message("Trailing whitespace trimmed from subfield contents");
+    }
+
     # automatable subfield maladies
     $record[$ptr] =~ s/code=" ">c/code="c">/;
     $record[$ptr] =~ s/code=" ">\$/code="c">\$/;
-- 
1.7.2.5
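
Review note: the pattern in this patch, \s{10,}, is not anchored, so it removes the first run of 10 or more whitespace characters anywhere in the line, not only at the end of the subfield contents. Below is a minimal sketch of an anchored variant. It assumes each $record[$ptr] line holds a single MARCXML <subfield> element, which the patch does not confirm. The helper name trim_excessive_trailing_ws is hypothetical and is not part of marc_cleanup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of marc_cleanup.
# Assumption: each line holds one MARCXML <subfield> element.
# Trims a run of 10 or more whitespace characters only when it sits
# directly before the closing </subfield> tag.
# Returns the (possibly modified) line and a 1/0 "changed" flag.
sub trim_excessive_trailing_ws {
    my ($line) = @_;
    my $changed = ($line =~ s|\s{10,}(</subfield>)|$1|) ? 1 : 0;
    return ($line, $changed);
}

my @samples = (
    # 12 trailing spaces: excessive, so trimmed
    '<subfield code="a">Title' . (' ' x 12) . '</subfield>',
    # 2 trailing spaces: possibly significant, so kept
    '<subfield code="a">Title  </subfield>',
    # 12 interior spaces: not trailing, so kept
    '<subfield code="a">A' . (' ' x 12) . 'B</subfield>',
);

for my $sample (@samples) {
    my ($out, $changed) = trim_excessive_trailing_ws($sample);
    print(($changed ? "trimmed: " : "kept:    "), "[$out]\n");
}
```

Capturing the closing tag and writing it back keeps the element well-formed and leaves short trailing runs alone, which matches the commit message's intent of trimming only "excessive" whitespace.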