Here’s
a version of the query that completes in about 1500 ms on our database, or
about 0.017 ms per record, which, if I’m right, should shave an additional
couple of seconds off for you (7650 ms):
SELECT
rm.record_type_code || rm.record_num AS
record_num,
email.field_content AS email,
pname.last_name || ', ' || pname.first_name
|| ' ' || pname.middle_name AS name,
pr.home_library_code
FROM
sierra_view.record_metadata rm
JOIN sierra_view.patron_record pr
ON rm.record_type_code = 'p' AND rm.id =
pr.record_id
JOIN sierra_view.varfield email
ON email.varfield_type_code = 'z' AND
email.record_id = pr.record_id
JOIN sierra_view.patron_record_fullname
pname
ON pname.patron_record_id = pr.record_id
WHERE
email.field_content !~ '[^@]+@[^@]+[.][^@]+';
The
main difference here between this one and yours and Joe’s is that it doesn’t
use any of the *_view tables. Those are constructed views that III created for
convenience, but in my experience they’re almost always slower than just
picking whichever tables you need and joining manually. Of course, 7-8 seconds
is still a very long response time for a UI, so doing something like what
Joshua suggested may still be necessary.
As an
aside, if you’re not familiar with using EXPLAIN and EXPLAIN ANALYZE, these are
very helpful tools for figuring out what’s going on in a particular query.
Their output is a bit cryptic, but there are several helpful
tutorials/explanations available. I always have to refer back to the Postgresql
documentation, here:
If
you’re using pg-admin, it can give you a nice visual representation (in the
Query window, go to Query / Explain and Query / Explain Analyze) that’s a bit
easier to parse than the raw output.
Good
luck!
Jason
Thomale
User
Interfaces Unit, UNT Libraries
From: Boggs, John
Sent: Wednesday, February 25, 2015
12:07 PM
To: Sierra Users List
Subject: [sierra] Speed up an SQL query?
To: Sierra Users List
Subject: [sierra] Speed up an SQL query?
Hi all,
I'm
trying to put together a canned report for my circ folks of badly-formed email
addresses. The report takes about 15 seconds to run, which isn't really all
that long but it does lead folks waiting for a web page to click on the link
again thinking maybe their click didn't work -- and that just causes the query
to start all over again. I have a much more complex query, over much more data,
for items in transit too long that runs in about 3 seconds and I'm puzzled as
to why the patron one would run so much slower. We have 2.5 million items
and only 450,000 patrons.
I'm
attaching the query in case anybody sees anything I'm missing. I have tried
selecting from each of the three tables that I'm using and joining in a
different orders, but it all takes at least 15 seconds.
Thanks
for any suggestions!
SELECT
'p' || pat_view.record_num
|| 'a',
email.field_content,
name.last_name ||
',' || name.first_name || ' ' || name.middle_name,
pat_view.home_library_code
FROM
sierra_view.patron_view AS pat_view
JOIN
sierra_view.varfield_view AS email
ON (
pat_view.record_num = email.record_num AND
email.record_type_code = 'p' AND
email.varfield_type_code = 'z' )
JOIN
sierra_view.patron_record_fullname AS name
ON (pat_view.id =
name.patron_record_id)
WHERE
email.field_content !~ '[^@]+@[^@]+[.][^@]+';
No comments:
Post a Comment