+++ /dev/null
-<?xml version="1.0" encoding="UTF-8"?>\r
-<chapter xml:id="intro_to_sql" xmlns="http://docbook.org/ns/docbook" version="5.0" xml:lang="EN"\r
- xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xlink="http://www.w3.org/1999/xlink">\r
- <chapterinfo>\r
- <title>Introduction to SQL for Evergreen Administrators</title>\r
- </chapterinfo>\r
- <abstract id="itnroSQL_abstract">\r
- <simpara>This chapter was taken from Dan Scott's <emphasis>Introduction to SQL for Evergreen Administrators</emphasis>, February 2010.</simpara>\r
- </abstract> \r
- <section id="intro_to_databases">\r
- <title>Introduction to SQL Databases</title>\r
- <indexterm><primary>sql</primary></indexterm>\r
- <simplesect>\r
- <title>Introduction</title>\r
- <simpara>Over time, the SQL database has become the standard method of storing,\r
- retrieving, and processing raw data for applications. Ranging from embedded\r
- databases such as SQLite and Apache Derby, to enterprise databases such as\r
- Oracle and IBM DB2, any SQL database offers basic advantages to application\r
- developers such as standard interfaces (Structured Query Language (SQL), Java\r
- Database Connectivity (JDBC), Open Database Connectivity (ODBC), Perl Database\r
- Independent Interface (DBI)), a standard conceptual model of data (tables,\r
- fields, relationships, constraints, etc), performance in storing and retrieving\r
- data, concurrent access, etc.</simpara>\r
- <simpara>Evergreen is built on PostgreSQL, an open source SQL database that began as\r
- <literal>POSTGRES</literal> at the University of California at Berkeley in 1986 as a research\r
- project led by Professor Michael Stonebraker. A SQL interface was added to a\r
- fork of the original POSTGRES Berkelely code in 1994, and in 1996 the project\r
- was renamed PostgreSQL.</simpara>\r
- </simplesect>\r
- <simplesect id="_tables">\r
- <title>Tables</title>\r
- <indexterm><primary>sql</primary><secondary>tables</secondary></indexterm>\r
- <simpara>The table is the cornerstone of a SQL database. Conceptually, a database table\r
- is similar to a single sheet in a spreadsheet: every table has one or more\r
- columns, with each row in the table containing values for each column. Each\r
- column in a table defines an attribute corresponding to a particular data type.</simpara>\r
- <simpara>We’ll insert a row into a table, then display the resulting contents. Don’t\r
- worry if the INSERT statement is completely unfamiliar, we’ll talk more about\r
- the syntax of the insert statement later.</simpara>\r
- <formalpara><title><literal>actor.usr_note</literal> database table</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-evergreen=# INSERT INTO actor.usr_note (usr, creator, pub, title, value)\r
- VALUES (1, 1, TRUE, 'Who is this guy?', 'He''s the administrator!');\r
-\r
-evergreen=# select id, usr, creator, pub, title, value from actor.usr_note;\r
- id | usr | creator | pub | title | value\r
-----+-----+---------+-----+------------------+-------------------------\r
- 1 | 1 | 1 | t | Who is this guy? | He's the administrator!\r
-(1 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>PostgreSQL supports table inheritance, which lets you define tables that\r
- inherit the column definitions of a given parent table. A search of the data in\r
- the parent table includes the data in the child tables. Evergreen uses table\r
- inheritance: for example, the <literal>action.circulation</literal> table is a child of the\r
- <literal>money.billable_xact</literal> table, and the <literal>money.*_payment</literal> tables all inherit from\r
- the <literal>money.payment</literal> parent table.</simpara>\r
- </simplesect>\r
- <simplesect id="_schemas">\r
- <title>Schemas</title>\r
- <simpara>PostgreSQL, like most SQL databases, supports the use of schema names to group\r
- collections of tables and other database objects together. You might think of\r
- schemas as namespaces if you’re a programmer; or you might think of the schema\r
- / table / column relationship like the area code / exchange / local number\r
- structure of a telephone number.</simpara>\r
- <table\r
- frame="all"\r
- rowsep="1" colsep="1"\r
- >\r
- <title>Examples: database object names</title>\r
- <?dbhtml table-width="100%"?>\r
- <?dbfo table-width="100%"?>\r
- <tgroup cols="4">\r
- <colspec colname="col_1" colwidth="2.0*"/>\r
- <colspec colname="col_2" colwidth="1.0*"/>\r
- <colspec colname="col_3" colwidth="1.0*"/>\r
- <colspec colname="col_4" colwidth="1.0*"/>\r
- <thead>\r
- <row>\r
- <entry align="left" valign="top">Full name </entry>\r
- <entry align="left" valign="top">Schema name </entry>\r
- <entry align="left" valign="top">Table name </entry>\r
- <entry align="left" valign="top">Field name</entry>\r
- </row>\r
- </thead>\r
- <tbody>\r
- <row>\r
- <entry align="left" valign="top"><simpara>actor.usr_note.title</simpara></entry>\r
- <entry align="left" valign="top"><simpara>actor</simpara></entry>\r
- <entry align="left" valign="top"><simpara>usr_note</simpara></entry>\r
- <entry align="left" valign="top"><simpara>title</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara>biblio.record_entry.marc</simpara></entry>\r
- <entry align="left" valign="top"><simpara>biblio</simpara></entry>\r
- <entry align="left" valign="top"><simpara>record_entry</simpara></entry>\r
- <entry align="left" valign="top"><simpara>marc</simpara></entry>\r
- </row>\r
- </tbody>\r
- </tgroup>\r
- </table>\r
- <simpara>The default schema name in PostgreSQL is <literal>public</literal>, so if you do not specify a\r
- schema name when creating or accessing a database object, PostgreSQL will use\r
- the <literal>public</literal> schema. As a result, you might not find the object that you’re\r
- looking for if you don’t use the appropriate schema.</simpara>\r
- <formalpara><title>Example: Creating a table without a specific schema</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-evergreen=# CREATE TABLE foobar (foo TEXT, bar TEXT);\r
-CREATE TABLE\r
-evergreen=# \d foobar\r
- Table "public.foobar"\r
- Column | Type | Modifiers\r
---------+------+-----------\r
- foo | text |\r
- bar | text |\r
-</programlisting>\r
- </para></formalpara>\r
- <formalpara><title>Example: Trying to access a unqualified table outside of the public schema</title><para>\r
- <programlisting language="sql" linenumbering="unnumbered">evergreen=# SELECT * FROM usr_note;\r
- ERROR: relation "usr_note" does not exist\r
- LINE 1: SELECT * FROM usr_note;\r
- ^</programlisting>\r
- </para></formalpara>\r
- <simpara>Evergreen uses schemas to organize all of its tables with mostly intuitive,\r
- if short, schema names. Here’s the current (as of 2010-01-03) list of schemas\r
- used by Evergreen:</simpara>\r
- <table\r
- frame="all"\r
- rowsep="1" colsep="1"\r
- >\r
- <title>Evergreen schema names</title>\r
- <?dbhtml table-width="80%"?>\r
- <?dbfo table-width="80%"?>\r
- <tgroup cols="2">\r
- <colspec colname="col_1" colwidth="1.0*"/>\r
- <colspec colname="col_2" colwidth="1.0*"/>\r
- <thead>\r
- <row>\r
- <entry align="left" valign="top">Schema name </entry>\r
- <entry align="left" valign="top">Description</entry>\r
- </row>\r
- </thead>\r
- <tbody>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>acq</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Acquisitions</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>action</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Circulation actions</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>action_trigger</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Event mechanisms</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>actor</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Evergreen users and organization units</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>asset</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Call numbers and copies</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>auditor</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Track history of changes to selected tables</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>authority</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Authority records</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>biblio</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Bibliographic records</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>booking</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Resource bookings</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>config</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Evergreen configurable options</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>container</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Buckets for records, call numbers, copies, and users</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>extend_reporter</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Extra views for report definitions</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>metabib</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Metadata about bibliographic records</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>money</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Fines and bills</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>offline</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Offline transactions</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>permission</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>User permissions</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>query</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Stored SQL statements</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>reporter</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Report definitions</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>search</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Search functions</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>serial</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Serial MFHD records</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>stats</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Convenient views of circulation and asset statistics</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>vandelay</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>MARC batch importer and exporter</simpara></entry>\r
- </row>\r
- </tbody>\r
- </tgroup>\r
- </table>\r
- <note><simpara>The term <emphasis>schema</emphasis> has two meanings in the world of SQL databases. We have\r
- discussed the schema as a conceptual grouping of tables and other database\r
- objects within a given namespace; for example, "the <emphasis role="strong">actor</emphasis> schema contains the\r
- tables and functions related to users and organizational units". Another common\r
- usage of <emphasis>schema</emphasis> is to refer to the entire data model for a given database;\r
- for example, "the Evergreen database schema".</simpara></note>\r
- </simplesect>\r
- <simplesect id="_columns">\r
- <title>Columns</title>\r
- <simpara>Each column definition consists of:</simpara>\r
- <itemizedlist>\r
- <listitem>\r
- <simpara>\r
- a data type\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- (optionally) a default value to be used whenever a row is inserted that\r
- does not contain a specific value\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- (optionally) one or more constraints on the values beyond data type\r
- </simpara>\r
- </listitem>\r
- </itemizedlist>\r
- <simpara>Although PostgreSQL supports dozens of data types, Evergreen makes our life\r
- easier by only using a handful.</simpara>\r
- <table\r
- frame="all"\r
- rowsep="1" colsep="1"\r
- >\r
- <title>PostgreSQL data types used by Evergreen</title>\r
- <?dbhtml table-width="90%"?>\r
- <?dbfo table-width="90%"?>\r
- <tgroup cols="3">\r
- <colspec colname="col_1" colwidth="1.0*"/>\r
- <colspec colname="col_2" colwidth="1.0*"/>\r
- <colspec colname="col_3" colwidth="2.5*"/>\r
- <thead>\r
- <row>\r
- <entry align="left" valign="top">Type name </entry>\r
- <entry align="left" valign="top">Description </entry>\r
- <entry align="left" valign="top">Limits</entry>\r
- </row>\r
- </thead>\r
- <tbody>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>INTEGER</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Medium integer</simpara></entry>\r
- <entry align="left" valign="top"><simpara>-2147483648 to +2147483647</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>BIGINT</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Large integer</simpara></entry>\r
- <entry align="left" valign="top"><simpara>-9223372036854775808 to 9223372036854775807</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>SERIAL</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Sequential integer</simpara></entry>\r
- <entry align="left" valign="top"><simpara>1 to 2147483647</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>BIGSERIAL</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Large sequential integer</simpara></entry>\r
- <entry align="left" valign="top"><simpara>1 to 9223372036854775807</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>TEXT</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Variable length character data</simpara></entry>\r
- <entry align="left" valign="top"><simpara>Unlimited length</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>BOOL</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Boolean</simpara></entry>\r
- <entry align="left" valign="top"><simpara>TRUE or FALSE</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>TIMESTAMP WITH TIME ZONE</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Timestamp</simpara></entry>\r
- <entry align="left" valign="top"><simpara>4713 BC to 294276 AD</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>TIME</literal></simpara></entry>\r
- <entry align="left" valign="top"><simpara>Time</simpara></entry>\r
- <entry align="left" valign="top"><simpara>Expressed in HH:MM:SS</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara><literal>NUMERIC</literal>(precision, scale)</simpara></entry>\r
- <entry align="left" valign="top"><simpara>Decimal</simpara></entry>\r
- <entry align="left" valign="top"><simpara>Up to 1000 digits of precision. In Evergreen mostly used for money\r
- values, with a precision of 6 and a scale of 2 (<literal>####.##</literal>).</simpara></entry>\r
- </row>\r
- </tbody>\r
- </tgroup>\r
- </table>\r
- <simpara>Full details about these data types are available from the\r
- <ulink url="http://www.postgresql.org/docs/8.4/static/datatype.html">data types section of\r
- the PostgreSQL manual</ulink>.</simpara>\r
- </simplesect>\r
- <simplesect id="_constraints">\r
- <title>Constraints</title>\r
- <simplesect id="_prevent_null_values">\r
- <title>Prevent NULL values</title>\r
- <simpara>A column definition may include the constraint <literal>NOT NULL</literal> to prevent NULL\r
- values. In PostgreSQL, a NULL value is not the equivalent of zero or false or\r
- an empty string; it is an explicit non-value with special properties. We’ll\r
- talk more about how to work with NULL values when we get to queries.</simpara>\r
- </simplesect>\r
- <simplesect id="_primary_key">\r
- <title>Primary key</title>\r
- <simpara>Every table can have at most one primary key. A primary key consists of one or\r
- more columns which together uniquely identify each row in a table. If you\r
- attempt to insert a row into a table that would create a duplicate or NULL\r
- primary key entry, the database rejects the row and returns an error.</simpara>\r
- <simpara>Natural primary keys are drawn from the intrinsic properties of the data being\r
- modelled. For example, some potential natural primary keys for a table that\r
- contains people would be:</simpara>\r
- <table\r
- frame="all"\r
- rowsep="1" colsep="1"\r
- >\r
- <title>Example: Some potential natural primary keys for a table of people</title>\r
- <?dbhtml table-width="90%"?>\r
- <?dbfo table-width="90%"?>\r
- <tgroup cols="3">\r
- <colspec colname="col_1" colwidth="1.0*"/>\r
- <colspec colname="col_2" colwidth="2.0*"/>\r
- <colspec colname="col_3" colwidth="2.0*"/>\r
- <thead>\r
- <row>\r
- <entry align="left" valign="top">Natural key </entry>\r
- <entry align="left" valign="top">Pros </entry>\r
- <entry align="left" valign="top">Cons</entry>\r
- </row>\r
- </thead>\r
- <tbody>\r
- <row>\r
- <entry align="left" valign="top"><simpara>First name, last name, address</simpara></entry>\r
- <entry align="left" valign="top"><simpara>No two people with the same name would ever live at the same address, right?</simpara></entry>\r
- <entry align="left" valign="top"><simpara>Lots of columns force data duplication in referencing tables</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara>SSN or driver’s license</simpara></entry>\r
- <entry align="left" valign="top"><simpara>These are guaranteed to be unique</simpara></entry>\r
- <entry align="left" valign="top"><simpara>Lots of people don’t have an SSN or a driver’s license</simpara></entry>\r
- </row>\r
- </tbody>\r
- </tgroup>\r
- </table>\r
- <simpara>To avoid problems with natural keys, many applications instead define surrogate\r
- primary keys. A surrogate primary keys is a column with an autoincrementing\r
- integer value added to a table definition that ensures uniqueness.</simpara>\r
- <simpara>Evergreen uses surrogate keys (a column named <literal>id</literal> with a <literal>SERIAL</literal> data type)\r
- for most of its tables.</simpara>\r
- </simplesect>\r
- <simplesect id="_foreign_keys">\r
- <title>Foreign keys</title>\r
- <simpara>Every table can contain zero or more foreign keys: one or more columns that\r
- refer to the primary key of another table.</simpara>\r
- <simpara>For example, let’s consider Evergreen’s modelling of the basic relationship\r
- between copies, call numbers, and bibliographic records. Bibliographic records\r
- contained in the <literal>biblio.record_entry</literal> table can have call numbers attached to\r
- them. Call numbers are contained in the <literal>asset.call_number</literal> table, and they can\r
- have copies attached to them. Copies are contained in the <literal>asset.copy</literal> table.</simpara>\r
- <table\r
- frame="all"\r
- rowsep="1" colsep="1"\r
- >\r
- <title>Example: Evergreen’s copy / call number / bibliographic record relationships</title>\r
- <?dbhtml table-width="100%"?>\r
- <?dbfo table-width="100%"?>\r
- <tgroup cols="4">\r
- <colspec colname="col_1" colwidth="1.0*"/>\r
- <colspec colname="col_2" colwidth="1.0*"/>\r
- <colspec colname="col_3" colwidth="1.0*"/>\r
- <colspec colname="col_4" colwidth="1.0*"/>\r
- <thead>\r
- <row>\r
- <entry align="left" valign="top">Table </entry>\r
- <entry align="left" valign="top">Primary key </entry>\r
- <entry align="left" valign="top">Column with a foreign key </entry>\r
- <entry align="left" valign="top">Points to</entry>\r
- </row>\r
- </thead>\r
- <tbody>\r
- <row>\r
- <entry align="left" valign="top"><simpara>asset.copy</simpara></entry>\r
- <entry align="left" valign="top"><simpara>asset.copy.id</simpara></entry>\r
- <entry align="left" valign="top"><simpara>asset.copy.call_number</simpara></entry>\r
- <entry align="left" valign="top"><simpara>asset.call_number.id</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara>asset.call_number</simpara></entry>\r
- <entry align="left" valign="top"><simpara>asset.call_number.id</simpara></entry>\r
- <entry align="left" valign="top"><simpara>asset.call_number.record</simpara></entry>\r
- <entry align="left" valign="top"><simpara>biblio.record_entry.id</simpara></entry>\r
- </row>\r
- <row>\r
- <entry align="left" valign="top"><simpara>biblio.record_entry</simpara></entry>\r
- <entry align="left" valign="top"><simpara>biblio.record_entry.id</simpara></entry>\r
- <entry align="left" valign="top"><simpara></simpara></entry>\r
- <entry align="left" valign="top"><simpara></simpara></entry>\r
- </row>\r
- </tbody>\r
- </tgroup>\r
- </table>\r
- </simplesect>\r
- <simplesect id="_check_constraints">\r
- <title>Check constraints</title>\r
- <simpara>PostgreSQL enables you to define rules to ensure that the value to be inserted\r
- or updated meets certain conditions. For example, you can ensure that an\r
- incoming integer value is within a specific range, or that a ZIP code matches a\r
- particular pattern.</simpara>\r
- </simplesect>\r
- </simplesect>\r
- <simplesect id="_deconstructing_a_table_definition_statement">\r
- <title>Deconstructing a table definition statement</title>\r
- <simpara>The <literal>actor.org_address</literal> table is a simple table in the Evergreen schema that\r
- we can use as a concrete example of many of the properties of databases that\r
- we have discussed so far.</simpara>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-CREATE TABLE actor.org_address (\r
- id SERIAL PRIMARY KEY, <co id="sqlCO1-1"/>\r
- valid BOOL NOT NULL DEFAULT TRUE, <co id="sqlCO1-2"/>\r
- address_type TEXT NOT NULL DEFAULT 'MAILING', <co id="sqlCO1-3"/>\r
- org_unit INT NOT NULL REFERENCES actor.org_unit (id) <co id="sqlCO1-4"/>\r
- DEFERRABLE INITIALLY DEFERRED,\r
- street1 TEXT NOT NULL,\r
- street2 TEXT, <co id="sqlCO1-5"/>\r
- city TEXT NOT NULL,\r
- county TEXT,\r
- state TEXT NOT NULL,\r
- country TEXT NOT NULL,\r
- post_code TEXT NOT NULL\r
-);\r
-</programlisting>\r
- <calloutlist>\r
- <callout arearefs="sqlCO1-1">\r
- <simpara>\r
- The column named <literal>id</literal> is defined with a special data type of <literal>SERIAL</literal>; if\r
- given no value when a row is inserted into a table, the database automatically\r
- generates the next sequential integer value for the column. <literal>SERIAL</literal> is a\r
- popular data type for a primary key because it is guaranteed to be unique - and\r
- indeed, the constraint for this column identifies it as the <literal>PRIMARY KEY</literal>.\r
- </simpara>\r
- </callout>\r
- <callout arearefs="sqlCO1-2">\r
- <simpara>\r
- The data type <literal>BOOL</literal> defines a boolean value: <literal>TRUE</literal> or <literal>FALSE</literal> are the only\r
- acceptable values for the column. The constraint <literal>NOT NULL</literal> instructs the\r
- database to prevent the column from ever containing a NULL value. The column\r
- property <literal>DEFAULT TRUE</literal> instructs the database to automatically set the value\r
- of the column to <literal>TRUE</literal> if no value is provided.\r
- </simpara>\r
- </callout>\r
- <callout arearefs="sqlCO1-3">\r
- <simpara>\r
- The data type <literal>TEXT</literal> defines a text column of practically unlimited length.\r
- As with the previous column, there is a <literal>NOT NULL</literal> constraint, and a default\r
- value of <literal>'MAILING'</literal> will result if no other value is supplied.\r
- </simpara>\r
- </callout>\r
- <callout arearefs="sqlCO1-4">\r
- <simpara>\r
- The <literal>REFERENCES actor.org_unit (id)</literal> clause indicates that this column has a\r
- foreign key relationship to the <literal>actor.org_unit</literal> table, and that the value of\r
- this column in every row in this table must have a corresponding value in the\r
- <literal>id</literal> column in the referenced table (<literal>actor.org_unit</literal>).\r
- </simpara>\r
- </callout>\r
- <callout arearefs="sqlCO1-5">\r
- <simpara>\r
- The column named <literal>street2</literal> demonstrates that not all columns have constraints\r
- beyond data type. In this case, the column is allowed to be NULL or to contain a\r
- <literal>TEXT</literal> value.\r
- </simpara>\r
- </callout>\r
- </calloutlist>\r
- </simplesect>\r
- <simplesect id="_displaying_a_table_definition_using_literal_psql_literal">\r
- <title>Displaying a table definition using <literal>psql</literal></title>\r
- <simpara>The <literal>psql</literal> command-line interface is the preferred method for accessing\r
- PostgreSQL databases. It offers features like tab-completion, readline support\r
- for recalling previous commands, flexible input and output formats, and\r
- is accessible via a standard SSH session.</simpara>\r
- <simpara>If you press the <literal>Tab</literal> key once after typing one or more characters of the\r
- database object name, <literal>psql</literal> automatically completes the name if there are no\r
- other matches. If there are other matches for your current input, nothing\r
- happens until you press the <literal>Tab</literal> key a second time, at which point <literal>psql</literal>\r
- displays all of the matches for your current input.</simpara>\r
- <simpara>To display the definition of a database object such as a table, issue the\r
- command <literal>\d _object-name_</literal>. For example, to display the definition of the\r
- actor.usr_note table:</simpara>\r
-<programlisting language="sh" linenumbering="unnumbered">\r
-$ psql evergreen <co id="sqlCO2-1"/>\r
-psql (8.4.1)\r
-Type "help" for help.\r
-\r
-evergreen=# \d actor.usr_note <co id="sqlCO2-2"/>\r
- Table "actor.usr_note"\r
- Column | Type | Modifiers\r
--------------+--------------------------+-------------------------------------------------------------\r
- id | bigint | not null default nextval('actor.usr_note_id_seq'::regclass)\r
- usr | bigint | not null\r
- creator | bigint | not null\r
- create_date | timestamp with time zone | default now()\r
- pub | boolean | not null default false\r
- title | text | not null\r
- value | text | not null\r
-Indexes:\r
- "usr_note_pkey" PRIMARY KEY, btree (id)\r
- "actor_usr_note_creator_idx" btree (creator)\r
- "actor_usr_note_usr_idx" btree (usr)\r
-Foreign-key constraints:\r
- "usr_note_creator_fkey" FOREIGN KEY (creator) REFERENCES actor.usr(id) ON ...\r
- "usr_note_usr_fkey" FOREIGN KEY (usr) REFERENCES actor.usr(id) ON DELETE ....\r
-\r
-evergreen=# \q <co id="sqlCO2-3"/>\r
-$\r
-</programlisting>\r
- <calloutlist>\r
- <callout arearefs="sqlCO2-1">\r
- <simpara>\r
- This is the most basic connection to a PostgreSQL database. You can use a\r
- number of other flags to specify user name, hostname, port, and other options.\r
- </simpara>\r
- </callout>\r
- <callout arearefs="sqlCO2-2">\r
- <simpara>\r
- The <literal>\d</literal> command displays the definition of a database object.\r
- </simpara>\r
- </callout>\r
- <callout arearefs="sqlCO2-3">\r
- <simpara>\r
- The <literal>\q</literal> command quits the <literal>psql</literal> session and returns you to the shell prompt.\r
- </simpara>\r
- </callout>\r
- </calloutlist>\r
- </simplesect>\r
- </section>\r
- <section id="basic_sql_queries">\r
- <title>Basic SQL queries</title>\r
- <simplesect id="_the_select_statement">\r
- <title>The SELECT statement</title>\r
- <simpara>The SELECT statement is the basic tool for retrieving information from a\r
- database. The syntax for most SELECT statements is:</simpara>\r
- <blockquote>\r
- <literallayout><literal>SELECT</literal> [<emphasis>columns(s)</emphasis>]\r
- <literal>FROM</literal> [<emphasis>table(s)</emphasis>]\r
- [<literal>WHERE</literal> <emphasis>condition(s)</emphasis>]\r
- [<literal>GROUP BY</literal> <emphasis>columns(s)</emphasis>]\r
- [<literal>HAVING</literal> <emphasis>grouping-condition(s)</emphasis>]\r
- [<literal>ORDER BY</literal> <emphasis>column(s)</emphasis>]\r
- [<literal>LIMIT</literal> <emphasis>maximum-results</emphasis>]\r
- [<literal>OFFSET</literal> <emphasis>start-at-result-#</emphasis>]\r
- ;</literallayout>\r
- </blockquote>\r
- <simpara>For example, to select all of the columns for each row in the\r
- <literal>actor.usr_address</literal> table, issue the following query:</simpara>\r
- <programlisting language="sql" linenumbering="unnumbered">SELECT *\r
- FROM actor.usr_address\r
- ;</programlisting>\r
- </simplesect>\r
- <simplesect id="_selecting_particular_columns_from_a_table">\r
- <title>Selecting particular columns from a table</title>\r
- <simpara><literal>SELECT *</literal> returns all columns from all of the tables included in your query.\r
- However, quite often you will want to return only a subset of the possible\r
- columns. You can retrieve specific columns by listing the names of the columns\r
- you want after the <literal>SELECT</literal> keyword. Separate each column name with a comma.</simpara>\r
- <simpara>For example, to select just the city, county, and state from the\r
- actor.usr_address table, issue the following query:</simpara>\r
- <programlisting language="sql" linenumbering="unnumbered">SELECT city, county, state\r
- FROM actor.usr_address\r
- ;</programlisting>\r
- </simplesect>\r
- <simplesect id="_sorting_results_with_the_order_by_clause">\r
- <title>Sorting results with the ORDER BY clause</title>\r
- <simpara>By default, a SELECT statement returns rows matching your query with no\r
- guarantee of any particular order in which they are returned. To force\r
- the rows to be returned in a particular order, use the ORDER BY clause\r
- to specify one or more columns to determine the sorting priority of the\r
- rows.</simpara>\r
- <simpara>For example, to sort the rows returned from your <literal>actor.usr_address</literal> query by\r
- city, with county and then zip code as the tie breakers, issue the\r
- following query:</simpara>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT city, county, state\r
- FROM actor.usr_address\r
- ORDER BY city, county, post_code\r
-;\r
-</programlisting>\r
- </simplesect>\r
- <simplesect id="_filtering_results_with_the_where_clause">\r
- <title>Filtering results with the WHERE clause</title>\r
- <simpara>Thus far, your results have been returning all of the rows in the table.\r
- Normally, however, you would want to restrict the rows that are returned to the\r
- subset of rows that match one or more conditions of your search. The <literal>WHERE</literal>\r
- clause enables you to specify a set of conditions that filter your query\r
- results. Each condition in the <literal>WHERE</literal> clause is an SQL expression that returns\r
- a boolean (true or false) value.</simpara>\r
- <simpara>For example, to restrict the results returned from your <literal>actor.usr_address</literal>\r
- query to only those rows containing a state value of <emphasis>Connecticut</emphasis>, issue the\r
- following query:</simpara>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT city, county, state\r
- FROM actor.usr_address\r
- WHERE state = 'Connecticut'\r
- ORDER BY city, county, post_code\r
-;\r
-</programlisting>\r
- <simpara>You can include more conditions in the <literal>WHERE</literal> clause with the <literal>OR</literal> and <literal>AND</literal>\r
- operators. For example, to further restrict the results returned from your\r
- <literal>actor.usr_address</literal> query to only those rows where the state column contains a\r
- value of <emphasis>Connecticut</emphasis> and the city column contains a value of <emphasis>Hartford</emphasis>,\r
- issue the following query:</simpara>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT city, county, state\r
- FROM actor.usr_address\r
- WHERE state = 'Connecticut'\r
- AND city = 'Hartford'\r
- ORDER BY city, county, post_code\r
-;\r
-</programlisting>\r
- <note><simpara>To return rows where the state is <emphasis>Connecticut</emphasis> and the city is <emphasis>Hartford</emphasis> or\r
- <emphasis>New Haven</emphasis>, you must use parentheses to explicitly group the city value\r
- conditions together, or else the database will evaluate the <literal>OR city = 'New\r
- Haven'</literal> clause entirely on its own and match all rows where the city column is\r
- <emphasis>New Haven</emphasis>, even though the state might not be <emphasis>Connecticut</emphasis>.</simpara></note>\r
- <formalpara><title>Trouble with OR</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT city, county, state\r
- FROM actor.usr_address\r
- WHERE state = 'Connecticut'\r
- AND city = 'Hartford' OR city = 'New Haven'\r
- ORDER BY city, county, post_code\r
-;\r
-\r
--- Can return unwanted rows because the OR is not grouped!\r
-</programlisting>\r
- </para></formalpara>\r
- <formalpara><title>Grouped OR’ed conditions</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT city, county, state\r
- FROM actor.usr_address\r
- WHERE state = 'Connecticut'\r
- AND (city = 'Hartford' OR city = 'New Haven')\r
- ORDER BY city, county, post_code\r
-;\r
-\r
--- The parentheses ensure that the OR is applied to the cities, and the\r
--- state in either case must be 'Connecticut'\r
-</programlisting>\r
- </para></formalpara>\r
- <simplesect id="_comparison_operators">\r
- <title>Comparison operators</title>\r
- <simpara>Here is a partial list of comparison operators that are commonly used in\r
- <literal>WHERE</literal> clauses:</simpara>\r
- <simplesect id="_comparing_two_scalar_values">\r
- <title>Comparing two scalar values</title>\r
- <itemizedlist>\r
- <listitem>\r
- <simpara>\r
- <literal>x = y</literal> (equal to)\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- <literal>x != y</literal> (not equal to)\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- <literal>x < y</literal> (less than)\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- <literal>x > y</literal> (greater than)\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- <literal>x LIKE y</literal> (TEXT value x matches a subset of TEXT y, where y is a string that\r
- can contain <emphasis>%</emphasis> as a wildcard for 0 or more characters, and <emphasis>_</emphasis> as a wildcard\r
- for a single character. For example, <literal>WHERE 'all you can eat fish and chips\r
- and a big stick' LIKE '%fish%stick'</literal> would return TRUE)\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- <literal>x ILIKE y</literal> (like LIKE, but the comparison ignores upper-case / lower-case)\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- <literal>x IN y</literal> (x is in the list of values y, where y can be a list or a SELECT\r
- statement that returns a list)\r
- </simpara>\r
- </listitem>\r
- </itemizedlist>\r
- </simplesect>\r
- </simplesect>\r
- </simplesect>\r
- <simplesect id="_null_values">\r
- <title>NULL values</title>\r
- <simpara>SQL databases have a special way of representing the value of a column that has\r
- no value: <literal>NULL</literal>. A <literal>NULL</literal> value is not equal to zero, and is not an empty\r
- string; it is equal to nothing, not even another <literal>NULL</literal>, because it has no value\r
- that can be compared.</simpara>\r
- <simpara>To return rows from a table where a given column is not <literal>NULL</literal>, use the\r
- <literal>IS NOT NULL</literal> comparison operator.</simpara>\r
- <formalpara><title>Retrieving rows where a column is not <literal>NULL</literal></title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT id, first_given_name, family_name\r
- FROM actor.usr\r
- WHERE second_given_name IS NOT NULL\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>Similarly, to return rows from a table where a given column is <literal>NULL</literal>, use\r
- the <literal>IS NULL</literal> comparison operator.</simpara>\r
- <formalpara><title>Retrieving rows where a column is <literal>NULL</literal></title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT id, first_given_name, second_given_name, family_name\r
- FROM actor.usr\r
- WHERE second_given_name IS NULL\r
-;\r
-\r
- id | first_given_name | second_given_name | family_name\r
-----+------------------+-------------------+----------------\r
- 1 | Administrator | | System Account\r
-(1 row)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>Notice that the <literal>NULL</literal> value in the output is displayed as empty space,\r
- indistinguishable from an empty string; this is the default display method in\r
- <literal>psql</literal>. You can change the behaviour of <literal>psql</literal> using the <literal>pset</literal> command:</simpara>\r
- <formalpara><title>Changing the way <literal>NULL</literal> values are displayed in <literal>psql</literal></title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-evergreen=# \pset null '(null)'\r
-Null display is '(null)'.\r
-\r
-SELECT id, first_given_name, second_given_name, family_name\r
- FROM actor.usr\r
- WHERE second_given_name IS NULL\r
-;\r
-\r
- id | first_given_name | second_given_name | family_name\r
-----+------------------+-------------------+----------------\r
- 1 | Administrator | (null) | System Account\r
-(1 row)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>Database queries within programming languages such as Perl and C have\r
- special methods of checking for <literal>NULL</literal> values in returned results.</simpara>\r
- </simplesect>\r
- <simplesect id="_text_delimiter">\r
- <title>Text delimiter: '</title>\r
- <simpara>You might have noticed that we have been using the <literal>'</literal> character to delimit\r
- TEXT values and values such as dates and times that are TEXT values. Sometimes,\r
- however, your TEXT value itself contains a <literal>'</literal> character, such as the word\r
- <literal>you’re</literal>. To prevent the database from prematurely ending the TEXT value at the\r
- first <literal>'</literal> character and returning a syntax error, use another <literal>'</literal> character to\r
- escape the following <literal>'</literal> character.</simpara>\r
- <simpara>For example, to change the last name of a user in the <literal>actor.usr</literal> table to\r
- <literal>L’estat</literal>, issue the following SQL:</simpara>\r
- <formalpara><title>Escaping <literal>'</literal> in TEXT values</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-UPDATE actor.usr\r
- SET family_name = 'L''estat'\r
- WHERE profile IN (\r
- SELECT id\r
- FROM permission.grp_tree\r
- WHERE name = 'Vampire'\r
- )\r
- ;</programlisting>\r
- </para></formalpara>\r
- <simpara>When you retrieve the row from the database, the value is displayed with just\r
- a single <literal>'</literal> character:</simpara>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT id, family_name\r
- FROM actor.usr\r
- WHERE family_name = 'L''estat'\r
-;\r
-\r
- id | family_name\r
-----+-------------\r
- 1 | L'estat\r
-(1 row)\r
-</programlisting>\r
- </simplesect>\r
- <simplesect id="_grouping_and_eliminating_results_with_the_group_by_and_having_clauses">\r
- <title>Grouping and eliminating results with the GROUP BY and HAVING clauses</title>\r
- <simpara>The GROUP BY clause returns a unique set of results for the desired columns.\r
- This is most often used in conjunction with an aggregate function to present\r
- results for a range of values in a single query, rather than requiring you to\r
- issue one query per target value.</simpara>\r
- <formalpara><title>Returning unique results of a single column with <literal>GROUP BY</literal></title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT grp\r
- FROM permission.grp_perm_map\r
- GROUP BY grp\r
- ORDER BY grp;\r
-\r
- grp\r
------+\r
- 1\r
- 2\r
- 3\r
- 4\r
- 5\r
- 6\r
- 7\r
- 10\r
-(8 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>While <literal>GROUP BY</literal> can be useful for a single column, it is more often used\r
- to return the distinct results across multiple columns. For example, the\r
- following query shows us which groups have permissions at each depth in\r
- the library hierarchy:</simpara>\r
- <formalpara><title>Returning unique results of multiple columns with <literal>GROUP BY</literal></title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT grp, depth\r
- FROM permission.grp_perm_map\r
- GROUP BY grp, depth\r
- ORDER BY depth, grp;\r
-\r
- grp | depth\r
------+-------\r
- 1 | 0\r
- 2 | 0\r
- 3 | 0\r
- 4 | 0\r
- 5 | 0\r
- 10 | 0\r
- 3 | 1\r
- 4 | 1\r
- 5 | 1\r
- 6 | 1\r
- 7 | 1\r
- 10 | 1\r
- 3 | 2\r
- 4 | 2\r
- 10 | 2\r
-(15 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>Extending this further, you can use the <literal>COUNT()</literal> aggregate function to\r
- also return the number of times each unique combination of <literal>grp</literal> and <literal>depth</literal>\r
- appears in the table. <emphasis>Yes, this is a sneak peek at the use of aggregate\r
- functions! Keeners.</emphasis></simpara>\r
- <formalpara><title>Counting unique column combinations with <literal>GROUP BY</literal></title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT grp, depth, COUNT(grp)\r
- FROM permission.grp_perm_map\r
- GROUP BY grp, depth\r
- ORDER BY depth, grp;\r
-\r
- grp | depth | count\r
------+-------+-------\r
- 1 | 0 | 6\r
- 2 | 0 | 2\r
- 3 | 0 | 45\r
- 4 | 0 | 3\r
- 5 | 0 | 5\r
- 10 | 0 | 1\r
- 3 | 1 | 3\r
- 4 | 1 | 4\r
- 5 | 1 | 1\r
- 6 | 1 | 9\r
- 7 | 1 | 5\r
- 10 | 1 | 10\r
- 3 | 2 | 24\r
- 4 | 2 | 8\r
- 10 | 2 | 7\r
-(15 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>You can use the <literal>WHERE</literal> clause to restrict the returned results before grouping\r
- is applied to the results. The following query restricts the results to those\r
- rows that have a depth of 0.</simpara>\r
- <formalpara><title>Using the <literal>WHERE</literal> clause with <literal>GROUP BY</literal></title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT grp, COUNT(grp)\r
- FROM permission.grp_perm_map\r
- WHERE depth = 0\r
- GROUP BY grp\r
- ORDER BY 2 DESC\r
-;\r
-\r
- grp | count\r
------+-------\r
- 3 | 45\r
- 1 | 6\r
- 5 | 5\r
- 4 | 3\r
- 2 | 2\r
- 10 | 1\r
-(6 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>To restrict results after grouping has been applied to the rows, use the\r
- <literal>HAVING</literal> clause; this is typically used to restrict results based on\r
- a comparison to the value returned by an aggregate function. For example,\r
- the following query restricts the returned rows to those that have more than\r
- 5 occurrences of the same value for <literal>grp</literal> in the table.</simpara>\r
- <formalpara><title><literal>GROUP BY</literal> restricted by a <literal>HAVING</literal> clause</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT grp, COUNT(grp)\r
- FROM permission.grp_perm_map\r
- GROUP BY grp\r
- HAVING COUNT(grp) > 5\r
-;\r
-\r
- grp | count\r
------+-------\r
- 6 | 9\r
- 4 | 15\r
- 5 | 6\r
- 1 | 6\r
- 3 | 72\r
- 10 | 18\r
-(6 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_eliminating_duplicate_results_with_the_distinct_keyword">\r
- <title>Eliminating duplicate results with the DISTINCT keyword</title>\r
- <simpara><literal>GROUP BY</literal> is one way of eliminating duplicate results from the rows returned\r
- by your query. The purpose of the <literal>DISTINCT</literal> keyword is to remove duplicate\r
- rows from the results of your query. However, it works, and it is easy - so if\r
- you just want a quick list of the unique set of values for a column or set of\r
- columns, the <literal>DISTINCT</literal> keyword might be appropriate.</simpara>\r
- <simpara>On the other hand, if you are getting duplicate rows back when you don’t expect\r
- them, then applying the <literal>DISTINCT</literal> keyword might be a sign that you are\r
- papering over a real problem.</simpara>\r
- <formalpara><title>Returning unique results of multiple columns with <literal>DISTINCT</literal></title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT DISTINCT grp, depth\r
- FROM permission.grp_perm_map\r
- ORDER BY depth, grp\r
-;\r
-\r
- grp | depth\r
------+-------\r
- 1 | 0\r
- 2 | 0\r
- 3 | 0\r
- 4 | 0\r
- 5 | 0\r
- 10 | 0\r
- 3 | 1\r
- 4 | 1\r
- 5 | 1\r
- 6 | 1\r
- 7 | 1\r
- 10 | 1\r
- 3 | 2\r
- 4 | 2\r
- 10 | 2\r
-(15 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_paging_through_results_with_the_limit_and_offset_clauses">\r
- <title>Paging through results with the LIMIT and OFFSET clauses</title>\r
- <simpara>The <literal>LIMIT</literal> clause restricts the total number of rows returned from your query\r
- and is useful if you just want to list a subset of a large number of rows. For\r
- example, in the following query we list the five most frequently used\r
- circulation modifiers:</simpara>\r
- <formalpara><title>Using the <literal>LIMIT</literal> clause to restrict results</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT circ_modifier, COUNT(circ_modifier)\r
- FROM asset.copy\r
- GROUP BY circ_modifier\r
- ORDER BY 2 DESC\r
- LIMIT 5\r
-;\r
-\r
- circ_modifier | count\r
----------------+--------\r
- CIRC | 741995\r
- BOOK | 636199\r
- SER | 265906\r
- DOC | 191598\r
- LAW MONO | 126627\r
-(5 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>When you use the <literal>LIMIT</literal> clause to restrict the total number of rows returned\r
- by your query, you can also use the <literal>OFFSET</literal> clause to determine which subset\r
- of the rows will be returned. The use of the <literal>OFFSET</literal> clause assumes that\r
- you’ve used the <literal>ORDER BY</literal> clause to impose order on the results.</simpara>\r
- <simpara>In the following example, we use the <literal>OFFSET</literal> clause to get results 6 through\r
- 10 from the same query that we prevously executed.</simpara>\r
- <formalpara><title>Using the <literal>OFFSET</literal> clause to return a specific subset of rows</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT circ_modifier, COUNT(circ_modifier)\r
- FROM asset.copy\r
- GROUP BY circ_modifier\r
- ORDER BY 2 DESC\r
- LIMIT 5\r
- OFFSET 5\r
-;\r
-\r
- circ_modifier | count\r
----------------+--------\r
- LAW SERIAL | 102758\r
- DOCUMENTS | 86215\r
- BOOK_WEB | 63786\r
- MFORM SER | 39917\r
- REF | 34380\r
-(5 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- </section>\r
- <section id="advanced_sql_queries">\r
- <title>Advanced SQL queries</title>\r
- <simplesect id="_transforming_column_values_with_functions">\r
- <title>Transforming column values with functions</title>\r
- <simpara>PostgreSQL includes many built-in functions for manipulating column data.\r
- You can also create your own functions (and Evergreen does make use of\r
- many custom functions). There are two types of functions used in\r
- databases: scalar functions and aggregate functions.</simpara>\r
- <simplesect id="_scalar_functions">\r
- <title>Scalar functions</title>\r
- <simpara>Scalar functions transform each value of the target column. If your query\r
- would return 50 values for a column in a given query, and you modify your\r
- query to apply a scalar function to the values returned for that column,\r
- it will still return 50 values. For example, the UPPER() function,\r
- used to convert text values to upper-case, modifies the results in the\r
- following set of queries:</simpara>\r
- <formalpara><title>Using the UPPER() scalar function to convert text values to upper-case</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
--- First, without the UPPER() function for comparison\r
-SELECT shortname, name\r
- FROM actor.org_unit\r
- WHERE id < 4\r
-;\r
-\r
- shortname | name\r
------------+-----------------------\r
- CONS | Example Consortium\r
- SYS1 | Example System 1\r
- SYS2 | Example System 2\r
-(3 rows)\r
-\r
--- Now apply the UPPER() function to the name column\r
-SELECT shortname, UPPER(name)\r
- FROM actor.org_unit\r
- WHERE id < 4\r
-;\r
-\r
- shortname | upper\r
------------+--------------------\r
- CONS | EXAMPLE CONSORTIUM\r
- SYS1 | EXAMPLE SYSTEM 1\r
- SYS2 | EXAMPLE SYSTEM 2\r
-(3 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>There are so many scalar functions in PostgreSQL that we cannot cover them\r
- all here, but we can list some of the most commonly used functions:</simpara>\r
- <itemizedlist>\r
- <listitem>\r
- <simpara>\r
- || - concatenates two text values together\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- COALESCE() - returns the first non-NULL value from the list of arguments\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- LOWER() - returns a text value converted to lower-case\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- REPLACE() - returns a text value after replacing all occurrences of a given text value with a different text value\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- REGEXP_REPLACE() - returns a text value after being transformed by a regular expression\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- UPPER() - returns a text value converted to upper-case\r
- </simpara>\r
- </listitem>\r
- </itemizedlist>\r
- <simpara>For a complete list of scalar functions, see\r
- <ulink url="http://www.postgresql.org/docs/8.3/interactive/functions.html">the PostgreSQL function documentation</ulink>.</simpara>\r
- </simplesect>\r
- <simplesect id="_aggregate_functions">\r
- <title>Aggregate functions</title>\r
- <simpara>Aggregate functions return a single value computed from the the complete set of\r
- values returned for the specified column.</simpara>\r
- <itemizedlist>\r
- <listitem>\r
- <simpara>\r
- AVG()\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- COUNT()\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- MAX()\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- MIN()\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- SUM()\r
- </simpara>\r
- </listitem>\r
- </itemizedlist>\r
- </simplesect>\r
- </simplesect>\r
- <simplesect id="_sub_selects">\r
- <title>Sub-selects</title>\r
- <simpara>A sub-select is the technique of using the results of one query to feed\r
- into another query. You can, for example, return a set of values from\r
- one column in a SELECT statement to be used to satisfy the IN() condition\r
- of another SELECT statement; or you could return the MAX() value of a\r
- column in a SELECT statement to match the = condition of another SELECT\r
- statement.</simpara>\r
- <simpara>For example, in the following query we use a sub-select to restrict the copies\r
- returned by the main SELECT statement to only those locations that have an\r
- <literal>opac_visible</literal> value of <literal>TRUE</literal>:</simpara>\r
- <formalpara><title>Sub-select example</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT call_number\r
- FROM asset.copy\r
- WHERE deleted IS FALSE\r
- AND location IN (\r
- SELECT id\r
- FROM asset.copy_location\r
- WHERE opac_visible IS TRUE\r
- )\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>Sub-selects can be an approachable way to breaking down a problem that\r
- requires matching values between different tables, and often result in\r
- a clearly expressed solution to a problem. However, if you start writing\r
- sub-selects within sub-selects, you should consider tackling the problem\r
- with joins instead.</simpara>\r
- </simplesect>\r
- <simplesect id="_joins">\r
- <title>Joins</title>\r
- <simpara>Joins enable you to access the values from multiple tables in your query\r
- results and comparison operators. For example, joins are what enable you to\r
- relate a bibliographic record to a barcoded copy via the <literal>biblio.record_entry</literal>,\r
- <literal>asset.call_number</literal>, and <literal>asset.copy</literal> tables. In this section, we discuss the\r
- most common kind of join—the inner join—as well as the less common outer join\r
- and some set operations which can compare and contrast the values returned by\r
- separate queries.</simpara>\r
- <simpara>When we talk about joins, we are going to talk about the left-hand table and\r
- the right-hand table that participate in the join. Every join brings together\r
- just two tables - but you can use an unlimited (for our purposes) number\r
- of joins in a single SQL statement. Each time you use a join, you effectively\r
- create a new table, so when you add a second join clause to a statement,\r
- table 1 and table 2 (which were the left-hand table and the right-hand table\r
- for the first join) now act as a merged left-hand table and the new table\r
- in the second join clause is the right-hand table.</simpara>\r
- <simpara>Clear as mud? Okay, let’s look at some examples.</simpara>\r
- <simplesect id="_inner_joins">\r
- <title>Inner joins</title>\r
- <simpara>An inner join returns all of the columns from the left-hand table in the join\r
- with all of the columns from the right-hand table in the joins that match a\r
- condition in the ON clause. Typically, you use the <literal>=</literal> operator to match the\r
- foreign key of the left-hand table with the primary key of the right-hand\r
- table to follow the natural relationship between the tables.</simpara>\r
- <simpara>In the following example, we return all of columns from the <literal>actor.usr</literal> and\r
- <literal>actor.org_unit</literal> tables, joined on the relationship between the user’s home\r
- library and the library’s ID. Notice in the results that some columns, like\r
- <literal>id</literal> and <literal>mailing_address</literal>, appear twice; this is because both the <literal>actor.usr</literal>\r
- and <literal>actor.org_unit</literal> tables include columns with these names. This is also why\r
- we have to fully qualify the column names in our queries with the schema and\r
- table names.</simpara>\r
- <formalpara><title>A simple inner join</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT *\r
- FROM actor.usr\r
- INNER JOIN actor.org_unit ON actor.usr.home_ou = actor.org_unit.id\r
- WHERE actor.org_unit.shortname = 'CONS'\r
-;\r
-\r
--[ RECORD 1 ]------------------+---------------------------------\r
-id | 1\r
-card | 1\r
-profile | 1\r
-usrname | admin\r
-email |\r
-...\r
-mailing_address |\r
-billing_address |\r
-home_ou | 1\r
-...\r
-claims_never_checked_out_count | 0\r
-id | 1\r
-parent_ou |\r
-ou_type | 1\r
-ill_address | 1\r
-holds_address | 1\r
-mailing_address | 1\r
-billing_address | 1\r
-shortname | CONS\r
-name | Example Consortium\r
-email |\r
-phone |\r
-opac_visible | t\r
-fiscal_calendar | 1\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>Of course, you do not have to return every column from the joined tables;\r
- you can (and should) continue to specify only the columns that you want to\r
- return. In the following example, we count the number of borrowers for\r
- every user profile in a given library by joining the <literal>permission.grp_tree</literal>\r
- table where profiles are defined against the <literal>actor.usr</literal> table, and then\r
- joining the <literal>actor.org_unit</literal> table to give us access to the user’s home\r
- library:</simpara>\r
- <formalpara><title>Borrower Count by Profile (Adult, Child, etc)/Library</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT permission.grp_tree.name, actor.org_unit.name, COUNT(permission.grp_tree.name)\r
- FROM actor.usr\r
- INNER JOIN permission.grp_tree\r
- ON actor.usr.profile = permission.grp_tree.id\r
- INNER JOIN actor.org_unit\r
- ON actor.org_unit.id = actor.usr.home_ou\r
- WHERE actor.usr.deleted IS FALSE\r
- GROUP BY permission.grp_tree.name, actor.org_unit.name\r
- ORDER BY actor.org_unit.name, permission.grp_tree.name\r
-;\r
-\r
- name | name | count\r
--------+--------------------+-------\r
- Users | Example Consortium | 1\r
-(1 row)\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_aliases">\r
- <title>Aliases</title>\r
- <simpara>So far we have been fully-qualifying all of our table names and column names to\r
- prevent any confusion. This quickly gets tiring with lengthy qualified\r
- table names like <literal>permission.grp_tree</literal>, so the SQL syntax enables us to assign\r
- aliases to table names and column names. When you define an alias for a table\r
- name, you can access its column throughout the rest of the statement by simply\r
- appending the column name to the alias with a period; for example, if you assign\r
- the alias <literal>au</literal> to the <literal>actor.usr</literal> table, you can access the <literal>actor.usr.id</literal>\r
- column through the alias as <literal>au.id</literal>.</simpara>\r
- <simpara>The formal syntax for declaring an alias for a column is to follow the column\r
- name in the result columns clause with <literal>AS</literal> <emphasis>alias</emphasis>. To declare an alias for a table name,\r
- follow the table name in the FROM clause (including any JOIN statements) with\r
- <literal>AS</literal> <emphasis>alias</emphasis>. However, the <literal>AS</literal> keyword is optional for tables (and columns as\r
- of PostgreSQL 8.4), and in practice most SQL statements leave it out. For\r
- example, we can write the previous INNER JOIN statement example using aliases\r
- instead of fully-qualified identifiers:</simpara>\r
- <formalpara><title>Borrower Count by Profile (using aliases)</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
- FROM actor.usr au\r
- INNER JOIN permission.grp_tree pgt\r
- ON au.profile = pgt.id\r
- INNER JOIN actor.org_unit aou\r
- ON aou.id = au.home_ou\r
- WHERE au.deleted IS FALSE\r
- GROUP BY pgt.name, aou.name\r
- ORDER BY aou.name, pgt.name\r
-;\r
-\r
- Profile | Library | Count\r
----------+--------------------+-------\r
- Users | Example Consortium | 1\r
-(1 row)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>A nice side effect of declaring an alias for your columns is that the alias\r
- is used as the column header in the results table. The previous version of\r
- the query, which didn’t use aliased column names, had two columns named\r
- <literal>name</literal>; this version of the query with aliases results in a clearer\r
- categorization.</simpara>\r
- </simplesect>\r
- <simplesect id="_outer_joins">\r
- <title>Outer joins</title>\r
- <simpara>An outer join returns all of the rows from one or both of the tables\r
- participating in the join.</simpara>\r
- <itemizedlist>\r
- <listitem>\r
- <simpara>\r
- For a LEFT OUTER JOIN, the join returns all of the rows from the left-hand\r
- table and the rows matching the join condition from the right-hand table, with\r
- NULL values for the rows with no match in the right-hand table.\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- A RIGHT OUTER JOIN behaves in the same way as a LEFT OUTER JOIN, with the\r
- exception that all rows are returned from the right-hand table participating in\r
- the join.\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- For a FULL OUTER JOIN, the join returns all the rows from both the left-hand\r
- and right-hand tables, with NULL values for the rows with no match in either\r
- the left-hand or right-hand table.\r
- </simpara>\r
- </listitem>\r
- </itemizedlist>\r
- <formalpara><title>Base tables for the OUTER JOIN examples</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT * FROM aaa;\r
-\r
- id | stuff\r
-----+-------\r
- 1 | one\r
- 2 | two\r
- 3 | three\r
- 4 | four\r
- 5 | five\r
-(5 rows)\r
-\r
-SELECT * FROM bbb;\r
-\r
- id | stuff | foo\r
-----+-------+----------\r
- 1 | one | oneone\r
- 2 | two | twotwo\r
- 5 | five | fivefive\r
- 6 | six | sixsix\r
-(4 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <formalpara><title>Example of a LEFT OUTER JOIN</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT * FROM aaa\r
- LEFT OUTER JOIN bbb ON aaa.id = bbb.id\r
-;\r
- id | stuff | id | stuff | foo\r
-----+-------+----+-------+----------\r
- 1 | one | 1 | one | oneone\r
- 2 | two | 2 | two | twotwo\r
- 3 | three | | |\r
- 4 | four | | |\r
- 5 | five | 5 | five | fivefive\r
-(5 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <formalpara><title>Example of a RIGHT OUTER JOIN</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT * FROM aaa\r
- RIGHT OUTER JOIN bbb ON aaa.id = bbb.id\r
-;\r
- id | stuff | id | stuff | foo\r
-----+-------+----+-------+----------\r
- 1 | one | 1 | one | oneone\r
- 2 | two | 2 | two | twotwo\r
- 5 | five | 5 | five | fivefive\r
- | | 6 | six | sixsix\r
-(4 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <formalpara><title>Example of a FULL OUTER JOIN</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT * FROM aaa\r
- FULL OUTER JOIN bbb ON aaa.id = bbb.id\r
-;\r
- id | stuff | id | stuff | foo\r
-----+-------+----+-------+----------\r
- 1 | one | 1 | one | oneone\r
- 2 | two | 2 | two | twotwo\r
- 3 | three | | |\r
- 4 | four | | |\r
- 5 | five | 5 | five | fivefive\r
- | | 6 | six | sixsix\r
-(6 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_self_joins">\r
- <title>Self joins</title>\r
- <simpara>It is possible to join a table to itself. You can, in fact you must, use\r
- aliases to disambiguate the references to the table.</simpara>\r
- </simplesect>\r
- </simplesect>\r
- <simplesect id="_set_operations">\r
- <title>Set operations</title>\r
- <simpara>Relational databases are effectively just an efficient mechanism for\r
- manipulating sets of values; they are implementations of set theory. There are\r
- three operators for sets (tables) in which each set must have the same number\r
- of columns with compatible data types: the union, intersection, and difference\r
- operators.</simpara>\r
- <formalpara><title>Base tables for the set operation examples</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT * FROM aaa;\r
-\r
- id | stuff\r
- ----+-------\r
- 1 | one\r
- 2 | two\r
- 3 | three\r
- 4 | four\r
- 5 | five\r
- (5 rows)\r
-\r
-SELECT * FROM bbb;\r
-\r
- id | stuff | foo\r
- ----+-------+----------\r
- 1 | one | oneone\r
- 2 | two | twotwo\r
- 5 | five | fivefive\r
- 6 | six | sixsix\r
-(4 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simplesect id="_union">\r
- <title>Union</title>\r
- <simpara>The <literal>UNION</literal> operator returns the distinct set of rows that are members of\r
- either or both of the left-hand and right-hand tables. The <literal>UNION</literal> operator\r
- does not return any duplicate rows. To return duplicate rows, use the\r
- <literal>UNION ALL</literal> operator.</simpara>\r
- <formalpara><title>Example of a UNION set operation</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
--- The parentheses are not required, but are intended to help\r
--- illustrate the sets participating in the set operation\r
-(\r
- SELECT id, stuff\r
- FROM aaa\r
-)\r
-UNION\r
-(\r
- SELECT id, stuff\r
- FROM bbb\r
-)\r
-ORDER BY 1\r
-;\r
-\r
- id | stuff\r
-----+-------\r
- 1 | one\r
- 2 | two\r
- 3 | three\r
- 4 | four\r
- 5 | five\r
- 6 | six\r
-(6 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_intersection">\r
- <title>Intersection</title>\r
- <simpara>The <literal>INTERSECT</literal> operator returns the distinct set of rows that are common to\r
- both the left-hand and right-hand tables. To return duplicate rows, use the\r
- <literal>INTERSECT ALL</literal> operator.</simpara>\r
- <formalpara><title>Example of an INTERSECT set operation</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-(\r
- SELECT id, stuff\r
- FROM aaa\r
-)\r
-INTERSECT\r
-(\r
- SELECT id, stuff\r
- FROM bbb\r
-)\r
-ORDER BY 1\r
-;\r
-\r
- id | stuff\r
-----+-------\r
- 1 | one\r
- 2 | two\r
- 5 | five\r
-(3 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_difference">\r
- <title>Difference</title>\r
- <simpara>The <literal>EXCEPT</literal> operator returns the rows in the left-hand table that do not\r
- exist in the right-hand table. You are effectively subtracting the common\r
- rows from the left-hand table.</simpara>\r
- <formalpara><title>Example of an EXCEPT set operation</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-(\r
- SELECT id, stuff\r
- FROM aaa\r
-)\r
-EXCEPT\r
-(\r
- SELECT id, stuff\r
- FROM bbb\r
-)\r
-ORDER BY 1\r
-;\r
-\r
- id | stuff\r
-----+-------\r
- 3 | three\r
- 4 | four\r
-(2 rows)\r
-\r
--- Order matters: switch the left-hand and right-hand tables\r
--- and you get a different result\r
-(\r
- SELECT id, stuff\r
- FROM bbb\r
-)\r
-EXCEPT\r
-(\r
- SELECT id, stuff\r
- FROM aaa\r
-)\r
-ORDER BY 1\r
-;\r
-\r
- id | stuff\r
-----+-------\r
- 6 | six\r
-(1 row)\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- </simplesect>\r
- <simplesect id="_views">\r
- <title>Views</title>\r
- <simpara>A view is a persistent <literal>SELECT</literal> statement that acts like a read-only table.\r
- To create a view, issue the <literal>CREATE VIEW</literal> statement, giving the view a name\r
- and a <literal>SELECT</literal> statement on which the view is built.</simpara>\r
- <simpara>The following example creates a view based on our borrower profile count:</simpara>\r
- <formalpara><title>Creating a view</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-CREATE VIEW actor.borrower_profile_count AS\r
- SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
- FROM actor.usr au\r
- INNER JOIN permission.grp_tree pgt\r
- ON au.profile = pgt.id\r
- INNER JOIN actor.org_unit aou\r
- ON aou.id = au.home_ou\r
- WHERE au.deleted IS FALSE\r
- GROUP BY pgt.name, aou.name\r
- ORDER BY aou.name, pgt.name\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>When you subsequently select results from the view, you can apply additional\r
- <literal>WHERE</literal> clauses to filter the results, or <literal>ORDER BY</literal> clauses to change the\r
- order of the returned rows. In the following examples, we issue a simple\r
- <literal>SELECT *</literal> statement to show that the default results are returned in the\r
- same order from the view as the equivalent SELECT statement would be returned.\r
- Then we issue a <literal>SELECT</literal> statement with a <literal>WHERE</literal> clause to further filter the\r
- results.</simpara>\r
- <formalpara><title>Selecting results from a view</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT * FROM actor.borrower_profile_count;\r
-\r
- Profile | Library | Count\r
-----------------------------+----------------------------+-------\r
- Faculty | University Library | 208\r
- Graduate | University Library | 16\r
- Patrons | University Library | 62\r
-...\r
-\r
--- You can still filter your results with WHERE clauses\r
-SELECT *\r
- FROM actor.borrower_profile_count\r
- WHERE "Profile" = 'Faculty';\r
-\r
- Profile | Library | Count\r
----------+----------------------------+-------\r
- Faculty | University Library | 208\r
- Faculty | College Library | 64\r
- Faculty | College Library 2 | 102\r
- Faculty | University Library 2 | 776\r
-(4 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_inheritance">\r
- <title>Inheritance</title>\r
- <simpara>PostgreSQL supports table inheritance: that is, a child table inherits its\r
- base definition from a parent table, but can add additional columns to its\r
- own definition. The data from any child tables is visible in queries against\r
- the parent table.</simpara>\r
- <simpara>Evergreen uses table inheritance in several areas:\r
- * In the Vandelay MARC batch importer / exporter, Evergreen defines base\r
- tables for generic queues and queued records for which authority record and\r
- bibliographic record child tables\r
- * Billable transactions are based on the <literal>money.billable_xact</literal> table;\r
- child tables include <literal>action.circulation</literal> for circulation transactions\r
- and <literal>money.grocery</literal> for general bills.\r
- * Payments are based on the <literal>money.payment</literal> table; its child table is\r
- <literal>money.bnm_payment</literal> (for brick-and-mortar payments), which in turn has child\r
- tables of <literal>money.forgive_payment</literal>, <literal>money.work_payment</literal>, <literal>money.credit_payment</literal>,\r
- <literal>money.goods_payment</literal>, and <literal>money.bnm_desk_payment</literal>. The\r
- <literal>money.bnm_desk_payment</literal> table in turn has child tables of <literal>money.cash_payment</literal>,\r
- <literal>money.check_payment</literal>, and <literal>money.credit_card_payment</literal>.\r
- * Transits are based on the <literal>action.transit_copy</literal> table, which has a child\r
- table of <literal>action.hold_transit_copy</literal> for transits initiated by holds.\r
- * Generic acquisition line items are defined by the\r
- <literal>acq.lineitem_attr_definition</literal> table, which in turn has a number of child\r
- tables to define MARC attributes, generated attributes, user attributes, and\r
- provider attributes.</simpara>\r
- </simplesect>\r
- </section>\r
- <section id="understanding_query_performance_with_explain">\r
- <title>Understanding query performance with EXPLAIN</title>\r
- <simpara>Some queries run for a long, long time. This can be the result of a poorly\r
- written query—a query with a join condition that joins every\r
- row in the <literal>biblio.record_entry</literal> table with every row in the <literal>metabib.full_rec</literal>\r
- view would consume a massive amount of memory and disk space and CPU time—or\r
- a symptom of a schema that needs some additional indexes. PostgreSQL provides\r
- the <literal>EXPLAIN</literal> tool to estimate how long it will take to run a given query and\r
- show you the <emphasis>query plan</emphasis> (how it plans to retrieve the results from the\r
- database).</simpara>\r
- <simpara>To generate the query plan without actually running the statement, simply\r
- prepend the <literal>EXPLAIN</literal> keyword to your query. In the following example, we\r
- generate the query plan for the poorly written query that would join every\r
- row in the <literal>biblio.record_entry</literal> table with every row in the <literal>metabib.full_rec</literal>\r
- view:</simpara>\r
- <formalpara><title>Query plan for a terrible query</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-EXPLAIN SELECT *\r
- FROM biblio.record_entry\r
- FULL OUTER JOIN metabib.full_rec ON 1=1\r
-;\r
-\r
- QUERY PLAN\r
--------------------------------------------------------------------------------//\r
- Merge Full Join (cost=0.00..4959156437783.60 rows=132415734100864 width=1379)\r
- -> Seq Scan on record_entry (cost=0.00..400634.16 rows=2013416 width=1292)\r
- -> Seq Scan on real_full_rec (cost=0.00..1640972.04 rows=65766704 width=87)\r
-(3 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>This query plan shows that the query would return 132415734100864 rows, and it\r
- plans to accomplish what you asked for by sequentially scanning (<emphasis>Seq Scan</emphasis>)\r
- every row in each of the tables participating in the join.</simpara>\r
- <simpara>In the following example, we have realized our mistake in joining every row of\r
- the left-hand table with every row in the right-hand table and take the saner\r
- approach of using an <literal>INNER JOIN</literal> where the join condition is on the record ID.</simpara>\r
- <formalpara><title>Query plan for a less terrible query</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-EXPLAIN SELECT *\r
- FROM biblio.record_entry bre\r
- INNER JOIN metabib.full_rec mfr ON mfr.record = bre.id;\r
- QUERY PLAN\r
-----------------------------------------------------------------------------------------//\r
- Hash Join (cost=750229.86..5829273.98 rows=65766704 width=1379)\r
- Hash Cond: (real_full_rec.record = bre.id)\r
- -> Seq Scan on real_full_rec (cost=0.00..1640972.04 rows=65766704 width=87)\r
- -> Hash (cost=400634.16..400634.16 rows=2013416 width=1292)\r
- -> Seq Scan on record_entry bre (cost=0.00..400634.16 rows=2013416 width=1292)\r
-(5 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>This time, we will return 65766704 rows - still way too many rows. We forgot\r
- to include a <literal>WHERE</literal> clause to limit the results to something meaningful. In\r
- the following example, we will limit the results to deleted records that were\r
- modified in the last month.</simpara>\r
- <formalpara><title>Query plan for a realistic query</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-EXPLAIN SELECT *\r
- FROM biblio.record_entry bre\r
- INNER JOIN metabib.full_rec mfr ON mfr.record = bre.id\r
- WHERE bre.deleted IS TRUE\r
- AND DATE_TRUNC('MONTH', bre.edit_date) >\r
- DATE_TRUNC ('MONTH', NOW() - '1 MONTH'::INTERVAL)\r
-;\r
-\r
- QUERY PLAN\r
-----------------------------------------------------------------------------------------//\r
- Hash Join (cost=5058.86..2306218.81 rows=201669 width=1379)\r
- Hash Cond: (real_full_rec.record = bre.id)\r
- -> Seq Scan on real_full_rec (cost=0.00..1640972.04 rows=65766704 width=87)\r
- -> Hash (cost=4981.69..4981.69 rows=6174 width=1292)\r
- -> Index Scan using biblio_record_entry_deleted on record_entry bre\r
- (cost=0.00..4981.69 rows=6174 width=1292)\r
- Index Cond: (deleted = true)\r
- Filter: ((deleted IS TRUE) AND (date_trunc('MONTH'::text, edit_date)\r
- > date_trunc('MONTH'::text, (now() - '1 mon'::interval))))\r
-(7 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>We can see that the number of rows returned is now only 201669; that’s\r
- something we can work with. Also, the overall cost of the query is 2306218,\r
- compared to 4959156437783 in the original query. The <literal>Index Scan</literal> tells us\r
- that the query planner will use the index that was defined on the <literal>deleted</literal>\r
- column to avoid having to check every row in the <literal>biblio.record_entry</literal> table.</simpara>\r
- <simpara>However, we are still running a sequential scan over the\r
- <literal>metabib.real_full_rec</literal> table (the table on which the <literal>metabib.full_rec</literal>\r
- view is based). Given that linking from the bibliographic records to the\r
- flattened MARC subfields is a fairly common operation, we could create a\r
- new index and see if that speeds up our query plan.</simpara>\r
- <formalpara><title>Query plan with optimized access via a new index</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
--- This index will take a long time to create on a large database\r
--- of bibliographic records\r
-CREATE INDEX bib_record_idx ON metabib.real_full_rec (record);\r
-\r
-EXPLAIN SELECT *\r
- FROM biblio.record_entry bre\r
- INNER JOIN metabib.full_rec mfr ON mfr.record = bre.id\r
- WHERE bre.deleted IS TRUE\r
- AND DATE_TRUNC('MONTH', bre.edit_date) >\r
- DATE_TRUNC ('MONTH', NOW() - '1 MONTH'::INTERVAL)\r
-;\r
-\r
- QUERY PLAN\r
-----------------------------------------------------------------------------------------//\r
- Nested Loop (cost=0.00..1558330.46 rows=201669 width=1379)\r
- -> Index Scan using biblio_record_entry_deleted on record_entry bre\r
- (cost=0.00..4981.69 rows=6174 width=1292)\r
- Index Cond: (deleted = true)\r
- Filter: ((deleted IS TRUE) AND (date_trunc('MONTH'::text, edit_date) >\r
- date_trunc('MONTH'::text, (now() - '1 mon'::interval))))\r
- -> Index Scan using bib_record_idx on real_full_rec\r
- (cost=0.00..240.89 rows=850 width=87)\r
- Index Cond: (real_full_rec.record = bre.id)\r
-(6 rows)\r
-</programlisting>\r
- </para></formalpara>\r
- <simpara>We can see that the resulting number of rows is still the same (201669), but\r
- the execution estimate has dropped to 1558330 because the query planner can\r
- use the new index (<literal>bib_record_idx</literal>) rather than scanning the entire table.\r
- Success!</simpara>\r
- <note><simpara>While indexes can significantly speed up read access to tables for common\r
- filtering conditions, every time a row is created or updated the corresponding\r
- indexes also need to be maintained - which can decrease the performance of\r
- writes to the database. Be careful to keep the balance of read performance\r
- versus write performance in mind if you plan to create custom indexes in your\r
- Evergreen database.</simpara></note>\r
- </section>\r
- <section id="inserting_updating_and_deleting_data">\r
- <title>Inserting, updating, and deleting data</title>\r
- <simplesect id="_inserting_data">\r
- <title>Inserting data</title>\r
- <simpara>To insert one or more rows into a table, use the INSERT statement to identify\r
- the target table and list the columns in the table for which you are going to\r
- provide values for each row. If you do not list one or more columns contained\r
- in the table, the database will automatically supply a <literal>NULL</literal> value for those\r
- columns. The values for each row follow the <literal>VALUES</literal> clause and are grouped in\r
- parentheses and delimited by commas. Each row, in turn, is delimited by commas\r
- (<emphasis>this multiple row syntax requires PostgreSQL 8.2 or higher</emphasis>).</simpara>\r
- <simpara>For example, to insert two rows into the <literal>permission.usr_grp_map</literal> table:</simpara>\r
- <formalpara><title>Inserting rows into the <literal>permission.usr_grp_map</literal> table</title><para>\r
- <programlisting language="sql" linenumbering="unnumbered">INSERT INTO permission.usr_grp_map (usr, grp)\r
- VALUES (2, 10), (2, 4)\r
- ;</programlisting>\r
- </para></formalpara>\r
- <simpara>Of course, as with the rest of SQL, you can replace individual column values\r
- with one or more use sub-selects:</simpara>\r
- <formalpara><title>Inserting rows using sub-selects instead of integers</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-INSERT INTO permission.usr_grp_map (usr, grp)\r
- VALUES (\r
- (SELECT id FROM actor.usr\r
- WHERE family_name = 'Scott' AND first_given_name = 'Daniel'),\r
- (SELECT id FROM permission.grp_tree\r
- WHERE name = 'Local System Administrator')\r
- ), (\r
- (SELECT id FROM actor.usr\r
- WHERE family_name = 'Scott' AND first_given_name = 'Daniel'),\r
- (SELECT id FROM permission.grp_tree\r
- WHERE name = 'Circulator')\r
- )\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_inserting_data_using_a_select_statement">\r
- <title>Inserting data using a SELECT statement</title>\r
- <simpara>Sometimes you want to insert a bulk set of data into a new table based on\r
- a query result. Rather than a <literal>VALUES</literal> clause, you can use a <literal>SELECT</literal>\r
- statement to insert one or more rows matching the column definitions. This\r
- is a good time to point out that you can include explicit values, instead\r
- of just column identifiers, in the return columns of the <literal>SELECT</literal> statement.\r
- The explicit values are returned in every row of the result set.</simpara>\r
- <simpara>In the following example, we insert 6 rows into the <literal>permission.usr_grp_map</literal>\r
- table; each row will have a <literal>usr</literal> column value of 1, with varying values for\r
- the <literal>grp</literal> column value based on the <literal>id</literal> column values returned from\r
- <literal>permission.grp_tree</literal>:</simpara>\r
- <formalpara><title>Inserting rows via a <literal>SELECT</literal> statement</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-INSERT INTO permission.usr_grp_map (usr, grp)\r
- SELECT 1, id\r
- FROM permission.grp_tree\r
- WHERE id > 2\r
-;\r
-\r
-INSERT 0 6\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_deleting_rows">\r
- <title>Deleting rows</title>\r
- <simpara>Deleting data from a table is normally fairly easy. To delete rows from a table,\r
- issue a <literal>DELETE</literal> statement identifying the table from which you want to delete\r
- rows and a <literal>WHERE</literal> clause identifying the row or rows that should be deleted.</simpara>\r
- <simpara>In the following example, we delete all of the rows from the\r
- <literal>permission.grp_perm_map</literal> table where the permission maps to\r
- <literal>UPDATE_ORG_UNIT_CLOSING</literal> and the group is anything other than administrators:</simpara>\r
- <formalpara><title>Deleting rows from a table</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-DELETE FROM permission.grp_perm_map\r
- WHERE grp IN (\r
- SELECT id\r
- FROM permission.grp_tree\r
- WHERE name != 'Local System Administrator'\r
- ) AND perm = (\r
- SELECT id\r
- FROM permission.perm_list\r
- WHERE code = 'UPDATE_ORG_UNIT_CLOSING'\r
- )\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- <note><simpara>There are two main reasons that a <literal>DELETE</literal> statement may not actually\r
- delete rows from a table, even when the rows meet the conditional clause.</simpara></note>\r
- <orderedlist numeration="arabic">\r
- <listitem>\r
- <simpara>\r
- If the row contains a value that is the target of a relational constraint,\r
- for example, if another table has a foreign key pointing at your target\r
- table, you will be prevented from deleting a row with a value corresponding\r
- to a row in the dependent table.\r
- </simpara>\r
- </listitem>\r
- <listitem>\r
- <simpara>\r
- If the table has a rule that substitutes a different action for a <literal>DELETE</literal>\r
- statement, the deletion will not take place. In Evergreen it is common for a\r
- table to have a rule that substitutes the action of setting a <literal>deleted</literal> column\r
- to <literal>TRUE</literal>. For example, if a book is discarded, deleting the row representing\r
- the copy from the <literal>asset.copy</literal> table would severely affect circulation statistics,\r
- bills, borrowing histories, and their corresponding tables in the database that\r
- have foreign keys pointing at the <literal>asset.copy</literal> table (<literal>action.circulation</literal> and\r
- <literal>money.billing</literal> and its children respectively). Instead, the <literal>deleted</literal> column\r
- value is set to <literal>TRUE</literal> and Evergreen’s application logic skips over these rows\r
- in most cases.\r
- </simpara>\r
- </listitem>\r
- </orderedlist>\r
- </simplesect>\r
- <simplesect id="_updating_rows">\r
- <title>Updating rows</title>\r
- <simpara>To update rows in a table, issue an <literal>UPDATE</literal> statement identifying the table\r
- you want to update, the column or columns that you want to set with their\r
- respective new values, and (optionally) a <literal>WHERE</literal> clause identifying the row or\r
- rows that should be updated.</simpara>\r
- <simpara>Following is the syntax for the <literal>UPDATE</literal> statement:</simpara>\r
- <blockquote>\r
- <literallayout><literal>UPDATE</literal> [<emphasis>table-name</emphasis>]\r
- <literal>SET</literal> [<emphasis>column</emphasis>] <literal>TO</literal> [<emphasis>new-value</emphasis>]\r
- <literal>WHERE</literal> [<emphasis>condition</emphasis>]\r
- ;</literallayout>\r
- </blockquote>\r
- </simplesect>\r
- </section>\r
- <section id="query_requests">\r
- <title>Query requests</title>\r
- <simpara>The following queries were requested by Bibliomation, but might be reusable\r
- by other libraries.</simpara>\r
- <simplesect id="_monthly_circulation_stats_by_collection_code_library">\r
- <title>Monthly circulation stats by collection code / library</title>\r
- <formalpara><title>Monthly Circulation Stats by Collection Code/Library</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT COUNT(acirc.id) AS "COUNT", aou.name AS "Library", acl.name AS "Copy Location"\r
- FROM asset.copy ac\r
- INNER JOIN asset.copy_location acl ON ac.location = acl.id\r
- INNER JOIN action.circulation acirc ON acirc.target_copy = ac.id\r
- INNER JOIN actor.org_unit aou ON acirc.circ_lib = aou.id\r
- WHERE DATE_TRUNC('MONTH', acirc.create_time) = DATE_TRUNC('MONTH', NOW() - INTERVAL '3 month')\r
- AND acirc.desk_renewal IS FALSE\r
- AND acirc.opac_renewal IS FALSE\r
- AND acirc.phone_renewal IS FALSE\r
- GROUP BY aou.name, acl.name\r
- ORDER BY aou.name, acl.name, 1\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_monthly_circulation_stats_by_borrower_stat_library">\r
- <title>Monthly circulation stats by borrower stat / library</title>\r
- <formalpara><title>Monthly Circulation Stats by Borrower Stat/Library</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT COUNT(acirc.id) AS "COUNT", aou.name AS "Library", asceum.stat_cat_entry AS "Borrower Stat"\r
- FROM action.circulation acirc\r
- INNER JOIN actor.org_unit aou ON acirc.circ_lib = aou.id\r
- INNER JOIN actor.stat_cat_entry_usr_map asceum ON asceum.target_usr = acirc.usr\r
- INNER JOIN actor.stat_cat astat ON asceum.stat_cat = astat.id\r
- WHERE DATE_TRUNC('MONTH', acirc.create_time) = DATE_TRUNC('MONTH', NOW() - INTERVAL '3 month')\r
- AND astat.name = 'Preferred language'\r
- AND acirc.desk_renewal IS FALSE\r
- AND acirc.opac_renewal IS FALSE\r
- AND acirc.phone_renewal IS FALSE\r
- GROUP BY aou.name, asceum.stat_cat_entry\r
- ORDER BY aou.name, asceum.stat_cat_entry, 1\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_monthly_intralibrary_loan_stats_by_library">\r
- <title>Monthly intralibrary loan stats by library</title>\r
- <formalpara><title>Monthly Intralibrary Loan Stats by Library</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT aou.name AS "Library", COUNT(acirc.id)\r
- FROM action.circulation acirc\r
- INNER JOIN actor.org_unit aou ON acirc.circ_lib = aou.id\r
- INNER JOIN asset.copy ac ON acirc.target_copy = ac.id\r
- INNER JOIN asset.call_number acn ON ac.call_number = acn.id\r
- WHERE acirc.circ_lib != acn.owning_lib\r
- AND DATE_TRUNC('MONTH', acirc.create_time) = DATE_TRUNC('MONTH', NOW() - INTERVAL '3 month')\r
- AND acirc.desk_renewal IS FALSE\r
- AND acirc.opac_renewal IS FALSE\r
- AND acirc.phone_renewal IS FALSE\r
- GROUP by aou.name\r
- ORDER BY aou.name, 2\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_monthly_borrowers_added_by_profile_adult_child_etc_library">\r
- <title>Monthly borrowers added by profile (adult, child, etc) / library</title>\r
- <formalpara><title>Monthly Borrowers Added by Profile (Adult, Child, etc)/Library</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
- FROM actor.usr au\r
- INNER JOIN permission.grp_tree pgt\r
- ON au.profile = pgt.id\r
- INNER JOIN actor.org_unit aou\r
- ON aou.id = au.home_ou\r
- WHERE au.deleted IS FALSE\r
- AND DATE_TRUNC('MONTH', au.create_date) = DATE_TRUNC('MONTH', NOW() - '3 months'::interval)\r
- GROUP BY pgt.name, aou.name\r
- ORDER BY aou.name, pgt.name\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_borrower_count_by_profile_adult_child_etc_library">\r
- <title>Borrower count by profile (adult, child, etc) / library</title>\r
- <formalpara><title>Borrower Count by Profile (Adult, Child, etc)/Library</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
- FROM actor.usr au\r
- INNER JOIN permission.grp_tree pgt\r
- ON au.profile = pgt.id\r
- INNER JOIN actor.org_unit aou\r
- ON aou.id = au.home_ou\r
- WHERE au.deleted IS FALSE\r
- GROUP BY pgt.name, aou.name\r
- ORDER BY aou.name, pgt.name\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_monthly_items_added_by_collection_library">\r
- <title>Monthly items added by collection / library</title>\r
- <simpara>We define a <quote>collection</quote> as a shelving location in Evergreen.</simpara>\r
- <formalpara><title>Monthly Items Added by Collection/Library</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-SELECT aou.name AS "Library", acl.name, COUNT(ac.barcode)\r
- FROM actor.org_unit aou\r
- INNER JOIN asset.call_number acn ON acn.owning_lib = aou.id\r
- INNER JOIN asset.copy ac ON ac.call_number = acn.id\r
- INNER JOIN asset.copy_location acl ON ac.location = acl.id\r
- WHERE ac.deleted IS FALSE\r
- AND acn.deleted IS FALSE\r
- AND DATE_TRUNC('MONTH', ac.create_date) = DATE_TRUNC('MONTH', NOW() - '1 month'::interval)\r
- GROUP BY aou.name, acl.name\r
- ORDER BY aou.name, acl.name\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_hold_purchase_alert_by_library">\r
- <title>Hold purchase alert by library</title>\r
- <simpara>in the following set of queries, we bring together the active title, volume,\r
- and copy holds and display those that have more than a certain number of holds\r
- per title. The goal is to UNION ALL the three queries, then group by the\r
- bibliographic record ID and display the title / author information for those\r
- records that have more than a given threshold of holds.</simpara>\r
- <formalpara><title>Hold Purchase Alert by Library</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
--- Title holds\r
-SELECT all_holds.bib_id, aou.name, rmsr.title, rmsr.author, COUNT(all_holds.bib_id)\r
- FROM\r
- (\r
- (\r
- SELECT target, request_lib\r
- FROM action.hold_request\r
- WHERE hold_type = 'T'\r
- AND fulfillment_time IS NULL\r
- AND cancel_time IS NULL\r
- )\r
- UNION ALL\r
- -- Volume holds\r
- (\r
- SELECT bre.id, request_lib\r
- FROM action.hold_request ahr\r
- INNER JOIN asset.call_number acn ON ahr.target = acn.id\r
- INNER JOIN biblio.record_entry bre ON acn.record = bre.id\r
- WHERE ahr.hold_type = 'V'\r
- AND ahr.fulfillment_time IS NULL\r
- AND ahr.cancel_time IS NULL\r
- )\r
- UNION ALL\r
- -- Copy holds\r
- (\r
- SELECT bre.id, request_lib\r
- FROM action.hold_request ahr\r
- INNER JOIN asset.copy ac ON ahr.target = ac.id\r
- INNER JOIN asset.call_number acn ON ac.call_number = acn.id\r
- INNER JOIN biblio.record_entry bre ON acn.record = bre.id\r
- WHERE ahr.hold_type = 'C'\r
- AND ahr.fulfillment_time IS NULL\r
- AND ahr.cancel_time IS NULL\r
- )\r
- ) AS all_holds(bib_id, request_lib)\r
- INNER JOIN reporter.materialized_simple_record rmsr\r
- INNER JOIN actor.org_unit aou ON aou.id = all_holds.request_lib\r
- ON rmsr.id = all_holds.bib_id\r
- GROUP BY all_holds.bib_id, aou.name, rmsr.id, rmsr.title, rmsr.author\r
- HAVING COUNT(all_holds.bib_id) > 2\r
- ORDER BY aou.name\r
-;\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- <simplesect id="_update_borrower_records_with_a_different_home_library">\r
- <title>Update borrower records with a different home library</title>\r
- <simpara>In this example, the library has opened a new branch in a growing area,\r
- and wants to reassign the home library for the patrons in the vicinity of\r
- the new branch to the new branch. To accomplish this, we create a staging table\r
- that holds a set of city names and the corresponding branch shortname for the home\r
- library for each city.</simpara>\r
- <simpara>Then we issue an <literal>UPDATE</literal> statement to set the home library for patrons with a\r
- physical address with a city that matches the city names in our staging table.</simpara>\r
- <formalpara><title>Update borrower records with a different home library</title><para>\r
-<programlisting language="sql" linenumbering="unnumbered">\r
-CREATE SCHEMA staging;\r
-CREATE TABLE staging.city_home_ou_map (city TEXT, ou_shortname TEXT,\r
- FOREIGN KEY (ou_shortname) REFERENCES actor.org_unit (shortname));\r
-INSERT INTO staging.city_home_ou_map (city, ou_shortname)\r
- VALUES ('Southbury', 'BR1'), ('Middlebury', 'BR2'), ('Hartford', 'BR3');\r
-BEGIN;\r
-\r
-UPDATE actor.usr au SET home_ou = COALESCE(\r
- (\r
- SELECT aou.id\r
- FROM actor.org_unit aou\r
- INNER JOIN staging.city_home_ou_map schom ON schom.ou_shortname = aou.shortname\r
- INNER JOIN actor.usr_address aua ON aua.city = schom.city\r
- WHERE au.id = aua.usr\r
- GROUP BY aou.id\r
- ), home_ou)\r
-WHERE (\r
- SELECT aou.id\r
- FROM actor.org_unit aou\r
- INNER JOIN staging.city_home_ou_map schom ON schom.ou_shortname = aou.shortname\r
- INNER JOIN actor.usr_address aua ON aua.city = schom.city\r
- WHERE au.id = aua.usr\r
- GROUP BY aou.id\r
-) IS NOT NULL;\r
-</programlisting>\r
- </para></formalpara>\r
- </simplesect>\r
- </section>\r
- \r
-</chapter>\r