PostgreSQL query performance issue – Database Administrators Stack Exchange

We are sometimes getting poor performance (~14s) when using the following query (PostgreSQL 9.6) to fetch rows from the table items whose ID is present in the table items_categories:

SELECT items.*
FROM items 
WHERE EXISTS (
    SELECT item_id 
    FROM items_categories 
    WHERE item_id = items.id  AND category_id = 626 
) 
AND items.active = TRUE
-- possibly some other "AND" filters on "items" here, but not considered for this question
ORDER BY modified_at DESC 
LIMIT 10

Relevant parts of our schema:

                              Table "public.items"
           Column      |       Type        |                     Modifiers
-----------------------+-------------------+----------------------------------------------------
 id                    | integer           | not null default nextval('items_id_seq'::regclass)
 active                | boolean           | default true
 modified_at           | timestamp without time zone | default now()
Indexes:
    "items_pkey" PRIMARY KEY, btree (id)
    "active_idx" btree (active)
    "aggregate_idx" btree (id)
    "items_modified_at_idx" btree (modified_at)


  Table "public.items_categories"
   Column    |  Type   | Modifiers
-------------+---------+-----------
 item_id     | integer | not null
 category_id | integer | not null
Indexes:
    "unique_cat_item_assoc" UNIQUE CONSTRAINT, btree (item_id, category_id)
    "items_categories_1_idx" btree (category_id)
    "items_categories_2_idx" btree (item_id)
Foreign-key constraints:
    "items_categories_category_id_fkey" FOREIGN KEY (category_id) REFERENCES categories(id)
    "items_categories_item_id_fkey" FOREIGN KEY (item_id) REFERENCES items(id)

The table items contains ~2 M rows, and the table items_categories contains ~4 M rows.

When we ask for 10 items (i.e. LIMIT 10 at the end of the above query) and 10 or more rows match in items_categories, performance is good (~10 ms). But when fewer than 10 rows match, the query takes ~14 s, because the backward index scan on items.modified_at ends up examining each of the ~2 M rows in items.

Query plan when less than 10 rows match in items_categories (poor performance):

Limit  (cost=0.86..11696.68 rows=10 width=1797) (actual time=168.376..14484.854 rows=7 loops=1)
  ->  Nested Loop Semi Join  (cost=0.86..2746178.23 rows=2348 width=1797) (actual time=168.376..14484.836 rows=7 loops=1)
        ->  Index Scan Backward using items_modified_at_idx on items  (cost=0.43..1680609.95 rows=2243424 width=1797) (actual time=0.054..7611.300 rows=2251395 loops=1)
              Filter: active
              Rows Removed by Filter: 2467
        ->  Index Only Scan using unique_cat_item_assoc on items_categories  (cost=0.43..0.47 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=2251395)
              Index Cond: ((item_id = items.id) AND (category_id = 626))
              Heap Fetches: 7
Planning time: 3.082 ms
Execution time: 14485.057 ms

Query plan when more than 10 rows match in items_categories (good performance):

Limit  (cost=0.86..24.07 rows=10 width=1857) (actual time=3.575..3.757 rows=10 loops=1)
  ->  Nested Loop Semi Join  (cost=0.86..2763459.56 rows=1190819 width=1857) (actual time=3.574..3.752 rows=10 loops=1)
        ->  Index Scan Backward using items_modified_at_idx on items  (cost=0.43..1684408.22 rows=2246967 width=1857) (actual time=0.013..2.205 rows=751 loops=1)
              Filter: active
        ->  Index Only Scan using unique_cat_item_assoc on items_categories  (cost=0.43..0.47 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=751)
              Index Cond: ((item_id = items.id) AND (category_id = 20))
              Heap Fetches: 10
Planning time: 1.650 ms
Execution time: 3.868 ms

How can we tune this query to handle both situations, i.e. good performance no matter how many rows of items_categories match?

I have a POC working where I first count the number of matching rows in items_categories (in a separate query); if the number is low, I use a CTE to work on a subset of items instead of all of its rows. But it's really a dirty temporary hack IMO…
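
One rewrite worth testing (a sketch, not a guaranteed fix): in PostgreSQL 9.6 a CTE is an optimization fence, so wrapping the items_categories lookup in a CTE forces the planner to collect the matching item_ids first instead of walking items_modified_at_idx backwards:

-- Sketch: on 9.6 the CTE is materialized first (optimization fence),
-- so the backward scan over all of items is no longer an option.
-- The unique constraint on (item_id, category_id) guarantees the join
-- produces no duplicate items for a single category.
WITH matching AS (
    SELECT item_id
    FROM items_categories
    WHERE category_id = 626
)
SELECT items.*
FROM items
JOIN matching ON matching.item_id = items.id
WHERE items.active
ORDER BY items.modified_at DESC
LIMIT 10;

The trade-off is that for a very common category this materializes and sorts every match, so both the sparse and the dense case should be benchmarked.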

Thank you!

Kafka Newbie Question – Database Administrators Stack Exchange

Please don't downvote this just because it's a newbie question. I am trying to learn this new technology, and it looks promising.

My data source, producer, and consumer are all on the same server (lab setup). Here is what I did:

  1. I created a Kafka topic:
    bin/kafka-topics.sh --create --topic T1 --bootstrap-server localhost:9092

  2. I loaded many csv into the topic:
    bin/kafka-console-producer.sh --topic T1 --bootstrap-server localhost:9092 < FILENAME

  3. The CSV files are many GB in size, with one row per record, in the format:
    Email,event_ID,timestamp,Description. All CSV files always have a header row.

  4. I can see all the data successfully, using:
    bin/kafka-console-consumer.sh --topic T1 --from-beginning --bootstrap-server localhost:9092

Question 1: How can I search for and return only the rows where the email address equals email@foo.com?

Question 2: How can I search for and return the rows where the email address equals “email@foo.com” AND event_ID equals “Z9284M”?

I am aware I can grep the relevant rows, but that would be too slow and resource-intensive. Is there a way for Kafka to index the data and search it efficiently?
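
For what it's worth, Kafka itself does not index message contents, so a plain consumer cannot do better than a scan. One commonly suggested approach (a sketch, assuming ksqlDB is deployed next to the broker; the stream name events is made up, the column names come from the CSV header, and timestamp is renamed ts to avoid clashing with a reserved word) is to declare a stream over the topic and query it with SQL:

-- Hypothetical ksqlDB stream over topic T1 (DELIMITED = comma-separated values).
CREATE STREAM events (
    Email       VARCHAR,
    event_ID    VARCHAR,
    ts          VARCHAR,
    Description VARCHAR
) WITH (KAFKA_TOPIC = 'T1', VALUE_FORMAT = 'DELIMITED');

-- Read from the start of the topic so existing records are scanned too.
SET 'auto.offset.reset' = 'earliest';

-- Question 2: filter on both columns; drop the second predicate for Question 1.
SELECT *
FROM events
WHERE Email = 'email@foo.com'
  AND event_ID = 'Z9284M'
EMIT CHANGES;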

Would appreciate all help!

mysql – Room availability – Database Administrators Stack Exchange

MariaDB 10.5 has a feature, Application Time Periods – WITHOUT OVERLAPS, whose documentation even lists room availability as an example.

The documentation gives this example syntax:

CREATE OR REPLACE TABLE rooms (
 room_number INT,
 guest_name VARCHAR(255),
 checkin DATE,
 checkout DATE,
 PERIOD FOR p(checkin,checkout),
 PRIMARY KEY (room_number, p WITHOUT OVERLAPS)
 );

Attempting to overlap a booking will result in an error:

INSERT INTO rooms VALUES 
 (1, 'Regina', '2020-10-01', '2020-10-03'),
 (2, 'Cochise', '2020-10-02', '2020-10-05'),
 (1, 'Nowell', '2020-10-03', '2020-10-07'),
 (2, 'Eusebius', '2020-10-04', '2020-10-06');
ERROR 1062 (23000): Duplicate entry '2-2020-10-06-2020-10-04' for key 'room_number'

So in your application you need to catch the duplicate-key error as an exception and treat it as "already booked", as sketched below.
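
A minimal sketch of doing that catch in the database itself (the procedure name book_room and the status strings are made up for illustration):

DELIMITER //
CREATE PROCEDURE book_room(IN p_room INT, IN p_guest VARCHAR(255),
                           IN p_checkin DATE, IN p_checkout DATE)
BEGIN
    -- 1062 is the duplicate-entry error raised by WITHOUT OVERLAPS above
    DECLARE EXIT HANDLER FOR 1062
        SELECT 'already booked' AS status;

    INSERT INTO rooms (room_number, guest_name, checkin, checkout)
    VALUES (p_room, p_guest, p_checkin, p_checkout);

    SELECT 'booked' AS status;
END //
DELIMITER ;

With the data above, CALL book_room(2, 'Eusebius', '2020-10-04', '2020-10-06'); would then return 'already booked' instead of raising the error.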

DB2 Linux authentication fails – Database Administrators Stack Exchange

I have a DB2 Express-C v10.5 instance configured to authenticate against LDAP. The LDAP server is going to be shut down, and I need to configure the same DB2 instance to use Linux authentication instead.

I copied the users from the LDAP server to the local Linux host running DB2, then shut down the LDAP server. After that, I changed the DB2 authentication settings with db2 update dbm cfg using SRVCON_PW_PLUGIN IBMOSauthserver (it was IBMLDAPauthserver before) and restarted DB2.

Applications access the database with the username db2smth (name changed for privacy reasons). I can connect to the database with db2 connect to dbname user db2inst1 using '********', but connecting to the same database as db2smth fails:

db2 => connect to dbname user db2smth using '********'

SQL30082N Security processing failed with reason "24" ("USERNAME AND/OR PASSWORD INVALID"). SQLSTATE=08001

Both su - db2smth and su - db2inst1 work fine, which means that Linux authentication itself works.

How can I diagnose what’s wrong with the authentication?
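
A couple of starting points (a sketch, using the same db2 CLP idiom as above; ~/sqllib/db2dump is only the default diagnostic path, and the suspicion that another plugin is still set to LDAP is an assumption):

# List every authentication-related setting the instance is using;
# CLNT_PW_PLUGIN and GROUP_PLUGIN may still point at the LDAP plugins.
db2 get dbm cfg | grep -i -e auth -e plugin

# Follow the diagnostic log while reproducing the failing connect;
# SQL30082N reason codes are logged there with more detail.
tail -f ~/sqllib/db2dump/db2diag.log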

Letsencrypt components for MySQL? – Database Administrators Stack Exchange

So, these are the three pieces MySQL looks for in the .cnf file:

[mysqld]
ssl_ca=ca.pem
ssl_cert=server-cert.pem
ssl_key=server-key.pem

And I'm using Letsencrypt for SSL certificate management. How do I map the certificates generated by Letsencrypt (on Ubuntu) to these three pieces for MySQL? I'm struggling most with the CA certificate; I don't see anything analogous among my Letsencrypt certs.
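
For what it's worth, certbot on Ubuntu leaves its output under /etc/letsencrypt/live/<name>/, and a plausible mapping (a sketch; example.com stands in for your actual certificate name, and the mysql user must be able to read the files, the private key in particular) is:

[mysqld]
# chain.pem holds the intermediate CA that signed the server certificate
ssl_ca=/etc/letsencrypt/live/example.com/chain.pem
# cert.pem is the server certificate itself (fullchain.pem = cert + chain)
ssl_cert=/etc/letsencrypt/live/example.com/cert.pem
# privkey.pem is the private key
ssl_key=/etc/letsencrypt/live/example.com/privkey.pem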

Adding user in mysql – Database Administrators Stack Exchange

Why am I not able to add a user with privileges in MySQL? I'm running the command grant all privileges on *.* to 'username'@localhost identified by 'strong password'; and I keep getting the error ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'identified by 'strong password'' at line 1. I can add the user separately and then assign the privileges on the db, but I would like to add the user and assign the privileges in a single command. I'm running the following MySQL version:

| version                  | 8.0.23-0ubuntu0.20.04.1       |
| version_comment          | (Ubuntu)                      |
| version_compile_machine  | x86_64                        |
| version_compile_os       | Linux                         |
| version_compile_zlib     | 1.2.11                        |
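
For reference, MySQL 8.0 removed the IDENTIFIED BY clause from GRANT, so the single-statement form only works on 5.7 and earlier; on 8.0 the account must exist before privileges are granted. The two-statement equivalent (using the account name from the question) is:

-- MySQL 8.0: GRANT no longer creates accounts, so create the user first...
CREATE USER 'username'@'localhost' IDENTIFIED BY 'strong password';
-- ...then grant the privileges to the now-existing account.
GRANT ALL PRIVILEGES ON *.* TO 'username'@'localhost';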
