performance – MySQL SELECT slow, but only 2 x 300K rows and indexes

I have the MySQL SELECT query below, and it is awfully slow.

It takes ~1.0 seconds to execute even though the two large tables have only 300K rows each and are indexed. I would love to find a way to make it faster, since it’s a query that needs to be run again and again.

The query:

SELECT p.id, p.image, c.name, s.name, MIN(p.saleprice)
FROM products p 
JOIN shops s ON p.shopid = s.id 
JOIN products_category pc ON p.id = pc.product_id 
JOIN categories c ON pc.category_id = c.id
WHERE brand_id > 0
AND pc.category_id = 46
AND pc.active = 1
AND p.price > 0
AND p.saleprice > 0
AND p.saleprice < p.price
AND (last_seen > DATE_SUB(NOW(), INTERVAL 2 DAY))
GROUP BY p.image

The query returns 960 rows.

The table products has 300,000 rows and these columns:

id (int, primary key)
name (varchar 512)
image (varchar 512)
price (int)
saleprice (int)
added (datetime)
last_seen (datetime)

It has one index across multiple columns in this order:

brand_id (int), shopid (int), last_seen (datetime), price (int), saleprice (int)

The table products_category also has 300,000 rows and these columns:

id (int, primary key)
product_id (int)
category_id (int)
active (int)

It has two indexes across multiple columns:

category_id (int), active (int)
product_id (int), active (int)

Based on similar questions here, I have tried nesting things with an inner select:

SELECT p.id, p.image, c.name, s.name, MIN(p.saleprice)
FROM 
(SELECT * FROM products WHERE brand_id > 0 AND price > 0 AND saleprice > 0 AND saleprice < price AND (last_seen > DATE_SUB(NOW(), INTERVAL 3 DAY))) p 
JOIN shops s ON p.shopid = s.id 
JOIN products_category pc ON p.id = pc.product_id 
JOIN categories c ON pc.category_id = c.id 
WHERE pc.category_id = 46
AND pc.active = 1
GROUP BY p.image

It didn’t help. The version with the inner select takes ~1.3 seconds to execute.

The problem seems to be the join between products and products_category, i.e. the two big tables with 300K rows each.

Maybe there’s a trick I can do with my indexes? Or can any of you spot something else I should optimize?

EXPLAIN of the query:

id  select_type  table  partitions  type    possible_keys                   key              key_len  ref            rows   filtered  Extra
1   SIMPLE       c      NULL        const   PRIMARY                         PRIMARY          4        const          1      100.00    Using temporary; Using filesort
1   SIMPLE       pc     NULL        ref     category_id etc,product_id etc  category_id etc  10       const,const    43104  100.00    Using where
1   SIMPLE       p      NULL        eq_ref  PRIMARY,brand_id etc            PRIMARY          4        pc.product_id  1      5.00      Using where
1   SIMPLE       s      NULL        eq_ref  PRIMARY                         PRIMARY          4        p.shopid       1      100.00    NULL

google sheets – Help me separate each student into separate rows without inputting the parent data each time

I am collecting data about parents and students. I need a separate row for each student without entering the family data each time. I have always used the “pre-filled link” option to do this, but I need a less tedious way and want to use a formula on my response sheet. I have found some resources online, but I cannot get any of the examples to work for me. Here is a copy of the response sheet. I would like each row to include the data from A-L (header data?) and to start a new row for each student: M-T, U-AB, AC-AJ, AK-AR. This is an example; the actual sheet will have more columns.
It has 2 tabs: the first shows how my responses are coming out, and the second shows how I want it to look. Any help is greatly appreciated.
https://docs.google.com/spreadsheets/d/1QlAk7qXLU5SZJSOHVY4QMCHjeFYdK1-4GrvDjpBMVmk/edit?usp=sharing

Is there a way to copy a selection spanning multiple rows, and paste them as merged cells spanning two rows each, in Google Sheets?

I’m going to be honest: my biggest issue is describing what I wish to accomplish. I can’t find the right word for it, so the title might not make a lot of sense. But the pictures should be clear.

I want to take this sheet:

[Image: begin]

Perform some operation, and end up with this:

[Image: end]

Currently this takes a lot of effort, particularly for large numbers of values. I first have to move each row down to get an empty row between each pair of rows with values, and then merge the cells individually. It takes a lot of clicks, and I do this semi-regularly. If there is an extension that does this, or a way to do it less laboriously, I would be very happy.

Google Sheets chart only shows first ~250 rows?

I have a stacked area chart with an x-axis from B8:B350 and two series from C8:C350 and D8:D350 (so the data range is B8:D350), but the chart only displays the first ~250 rows, which cover September to May. Any idea why it would do this, and what I could do to get the rest of my data to display?

[Image: chart and data range]

google sheets – How do I convert rows with 45 columns into 15 separate rows, each with 3 columns?

I am working with a Google Sheet that gets data from an email parser. Each time an email comes in, a single row is created that fills in these columns:

B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, AA, AB, AC, AD, AE, AF, AG, AH, AI, AJ, AK, AL, AM, AN, AO, AP, AQ.

I’d like to have it output as:

B1 | C1 | D1

E1 | F1 | G1
....

AO1 | AP1 | AQ1

B2 | C2 | D2
....

Is this possible?
I’ve tried using this:

=FILTER({Sheet1!B:B,Sheet1!C:C,Sheet1!D:D;Sheet1!E:E,Sheet1!F:F,Sheet1!G:G;Sheet1!H:H,Sheet1!I:I,Sheet1!J:J;Sheet1!K:K,Sheet1!L:L,Sheet1!M:M;Sheet1!N:N,Sheet1!O:O,Sheet1!P:P;Sheet1!Q:Q,Sheet1!R:R,Sheet1!S:S;Sheet1!T:T,Sheet1!U:U,Sheet1!V:V;Sheet1!W:W,Sheet1!X:X,Sheet1!Y:Y;Sheet1!Z:Z,Sheet1!AA:AA,Sheet1!AB:AB;Sheet1!AC:AC,Sheet1!AD:AD,Sheet1!AE:AE;Sheet1!AF:AF,Sheet1!AG:AG,Sheet1!AH:AH;Sheet1!AI:AI,Sheet1!AJ:AJ,Sheet1!AK:AK;Sheet1!AL:AL,Sheet1!AM:AM,Sheet1!AN:AN;Sheet1!AO:AO,Sheet1!AP:AP,Sheet1!AQ:AQ},LEN(Sheet1!A:A,Sheet1!A:A,Sheet1!A:A))

But it was just based on another answer I saw on here, and I am sure that I am not applying the filter(range,len()) part correctly.
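Independent of the Sheets formula, the target layout can be pinned down in a few lines of Python. This is only a sketch of the reshaping logic, not a Sheets solution; `col_name` and `reshape` are hypothetical helper names, and the cell labels stand in for real values. Note that B through AQ as listed is 42 columns, which gives 14 output rows per source row; a 45-column row would give 15.

```python
def col_name(n):
    """Convert a 1-based column number to a spreadsheet letter (1 -> A, 27 -> AA)."""
    name = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        name = chr(ord("A") + rem) + name
    return name


def reshape(rows, width=3):
    """Split every source row into consecutive chunks of `width` cells."""
    out = []
    for row in rows:
        for i in range(0, len(row), width):
            out.append(row[i:i + width])
    return out


# Columns B (2) through AQ (43), as listed in the question.
cols = [col_name(n) for n in range(2, 44)]

# Two placeholder source rows whose "values" are just their cell addresses.
source = [[c + str(r) for c in cols] for r in (1, 2)]
result = reshape(source)

for line in result[:2]:
    print(" | ".join(line))
```

Each source row is consumed left to right before the next one starts, which matches the order shown in the desired output.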

MySQL convert rows to column

I am working with 2 tables and need help producing an output that converts rows to columns. I need to sum the values first, by grouping.

Here is the fiddle: https://www.db-fiddle.com/f/kmQjRvvensRTfYsSELxMF2/1

Here is the table:

CREATE TABLE teacher (
TeacherId INT, BranchId VARCHAR(5));
INSERT INTO teacher VALUES
("1121","A"),
("1132","A"),
("1141","A"),
("2120","B"), 
("2122","B"),
("2123","B");
                               
CREATE TABLE activities (
ID INT, TeacherID INT,    Hours   INT);

INSERT INTO activities VALUES
(1,1121,2),
(2,1121,1),
(3,1132,1),
(4,1141,NULL),
(5,2120,NULL),
(6,2122,NULL),
(7,2123,2),
(7,2123,2);

My SQL:

    SELECT totalhours hours
             , branchid
             , COUNT(*) total
          FROM 
             ( SELECT COALESCE(y.hr,0) totalhours
                    , x.branchid
                    , x.teacherid
                 FROM teacher x
                 JOIN 
                    ( SELECT teacherid
                           , SUM(hours) hr 
                        FROM activities
                       GROUP 
                          BY teacherid
                       ORDER 
                          BY hr ASC
                    ) y
                   ON x.teacherid = y.teacherid
             ) a
         GROUP 
            BY hours
             , branchid
         ORDER
            BY hours
             , branchid;

Output:

   +---------------+-------------------+--------------------+
   |     hours     |     branchid      |       total        |
   +---------------+-------------------+--------------------+
   |       0       |        A          |         1          |
   |       0       |        B          |         2          |
   |       1       |        A          |         1          |
   |       3       |        A          |         1          |
   |       4       |        B          |         1          |
   +---------------+-------------------+--------------------+

Explanation:

The teacher table consists of a teacher id and a branch id, while the activities table consists of an id, a foreign key teacher id, and hours. Hours indicates the duration of each activity carried out by a teacher. A teacher can do more than one activity, or none at all. For teachers who have not done any activity, hours is set to NULL.

The objective of the query is to produce a table that summarizes teacher activity by branch, grouped by hours.

In the expected output table, ‘Hours’ is a fixed set of values in ascending order from 0 to 12. A row is displayed for each value even if neither A nor B has any teachers with that many hours. The A and B columns are the branches. Each value is the number of teachers in that branch with that total number of hours. So, for row 0, there is 1 teacher in branch A and there are 2 teachers in branch B who are not doing any activities.

Expected output:

   +-----------+------------+------------+
   |   Hours   |     A      |     B      |
   +-----------+------------+------------+
   |     0     |     1      |     2      |
   |     1     |     1      |     0      |
   |     2     |     0      |     0      |
   |     3     |     1      |     0      |
   |     4     |     0      |     1      |
   +-----------+------------+------------+
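One common way to get this shape is conditional aggregation over a generated list of hours. The sketch below runs the SQL through Python's built-in sqlite3 module so that it is runnable as-is; the same `WITH RECURSIVE` and `SUM(CASE ...)` constructs should carry over to MySQL 8.0 or later, but that is an assumption to verify against the fiddle. The CTE names `h` and `per_teacher` are made up for illustration, and the branch columns are hard-coded to A and B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher (TeacherId INT, BranchId VARCHAR(5));
INSERT INTO teacher VALUES
(1121,'A'),(1132,'A'),(1141,'A'),(2120,'B'),(2122,'B'),(2123,'B');
CREATE TABLE activities (ID INT, TeacherID INT, Hours INT);
INSERT INTO activities VALUES
(1,1121,2),(2,1121,1),(3,1132,1),(4,1141,NULL),
(5,2120,NULL),(6,2122,NULL),(7,2123,2),(7,2123,2);
""")

PIVOT_SQL = """
WITH RECURSIVE h(hours) AS (          -- fixed list of hours 0..12
    SELECT 0
    UNION ALL
    SELECT hours + 1 FROM h WHERE hours < 12
),
per_teacher AS (                      -- total hours per teacher, NULL -> 0
    SELECT t.TeacherId, t.BranchId, COALESCE(SUM(a.Hours), 0) AS total
    FROM teacher t
    LEFT JOIN activities a ON a.TeacherID = t.TeacherId
    GROUP BY t.TeacherId, t.BranchId
)
SELECT h.hours,
       SUM(CASE WHEN p.BranchId = 'A' THEN 1 ELSE 0 END) AS A,
       SUM(CASE WHEN p.BranchId = 'B' THEN 1 ELSE 0 END) AS B
FROM h
LEFT JOIN per_teacher p ON p.total = h.hours
GROUP BY h.hours
ORDER BY h.hours
"""

rows = conn.execute(PIVOT_SQL).fetchall()
for row in rows[:5]:
    print(row)
```

The LEFT JOIN from the hours list is what keeps rows such as hour 2 in the result even though no teacher has that total.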

Google Sheets formula to find rows with matching values, looking up in multiple columns

What would be the Google Sheets formula to search for a matching value in a range that spans multiple rows and columns? For example, I need to search the entire range H:P (all rows and columns) and find the cells with a matching value, if any. Ultimately, in this case I need just a list of the row numbers where a matching cell is found. In the screenshot there are two matches highlighted in green: one on O2 and one on M3. So in this case I need a result like “2,3”.

I have tried various things for several hours with no luck. Most examples of formulas that I could find and understand are about looking up in either a single column, or row.

Any help appreciated! Thank you!

[Image: range to be searched]

sql server – Combine Rows with indirect relation

I am trying to create a report from a cloud based EHR so I cannot share real data and some of these tables are fairly massive. I will try to minimize and share the bare minimum and expand if someone needs more information to help. This should be fairly easy and I’m just having a brain fart I think. I need to combine multiple answers into a single row as separate columns.

Here is my query as it is. It does return all the answers, but every answer generates a separate row. There will only ever be one answer for each question per visit id.

There are a few catches to working with this system. At its heart it’s SQL Server, but queries are restricted to starting with ‘select’, which makes temp tables a bit more difficult. There can be no spaces, no blank lines, nothing before your select. This is their version of security, I guess. All reports are written through a web interface, with no direct access to the db in any way.

Current Output:

clientvisit_id  |  client_id  |  members_present  |  patient_category

141001          |  2001       |                   |
141001          |  2001       |                   |      
141001          |  2001       |  Patient          |      
141001          |  2001       |                   |  Adult       

Desired output:

clientvisit_id  |  client_id  |  members_present  |  patient_category

141001          |  2001       |  Patient          |  Adult

Query:

Select 
  cv.clientvisit_id,
  cv.client_id,
  mp.answer as members_present,
  pc.answer as patient_category

From ClientVisit cv
  Inner Join SavedVisitAnswer sva On sva.clientvisit_id = cv.clientvisit_id
  Inner Join Question q On sva.question_id = q.question_id
  Inner Join Category cat On q.category_id = cat.category_id
  Inner Join FormVersion fv On cat.form_ver_id = fv.form_ver_id
  Inner Join Forms On fv.form_id = Forms.form_id
  Inner Join (Select
     a1.answer_id,
     a1.answer
     From Answer a1
     Where a1.question_id = '532096'
  ) as pc on sva.answer_id = pc.answer_id 
  Inner Join  (Select
     a2.answer,
     a2.answer_id
     From Answer a2
     Where a2.question_id =  '532093'
  ) as mp on sva.answer_id = mp.answer_id


Where 
  Forms.form_id = '246'

sql server – Recommendations about deleting large set of rows in MSSQL

I need to delete 75+ million rows every day from a table that contains around 3.5 billion records.

The database recovery model is simple. I have written code that deletes 15,000 rows at a time in a while loop until all 75M records are deleted (I delete in batches to limit log file growth). However, at the current deletion speed it looks like it will take at least 5 days, which means that the data to be deleted accumulates several times faster than I can delete it.

Basically, what I’m trying to do is summarize data older than 2 months (in another table) and then delete it. There are no update operations on that table, only inserts and deletes.

I have the Enterprise edition of MSSQL 2017.

Any suggestions will be welcome.
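To make the gap concrete, the figures above can be turned into a per-batch time budget. This is plain arithmetic on the numbers stated in the question (75M rows per day, 15,000 rows per batch, 5 days per run); since 75M and 5 days are both lower bounds, the results are optimistic.

```python
ROWS_PER_DAY = 75_000_000     # rows that must be deleted every day
BATCH_SIZE = 15_000           # rows deleted per loop iteration
CURRENT_DAYS = 5              # observed time to delete one day's worth
SECONDS_PER_DAY = 86_400

batches = ROWS_PER_DAY // BATCH_SIZE
current_secs_per_batch = CURRENT_DAYS * SECONDS_PER_DAY / batches
budget_secs_per_batch = SECONDS_PER_DAY / batches
speedup_needed = current_secs_per_batch / budget_secs_per_batch

print(f"batches per day:         {batches}")
print(f"current seconds/batch:   {current_secs_per_batch:.2f}")
print(f"budget seconds/batch:    {budget_secs_per_batch:.2f}")
print(f"required speedup factor: {speedup_needed:.1f}")
```

In other words, each 15,000-row batch currently takes roughly 86 seconds, and it would have to finish in about 17 seconds just to keep pace with the incoming data.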

optimization – Bounding 0-1 matrix with k unique rows

Problem Statement:
Suppose that I have a $0$-$1$ matrix $A$ (all of the entries are $0$ or $1$). I wish to find the tightest upper bound with $k$ many unique rows. To be more precise, let $S$ denote the set of $0$-$1$ matrices $B$ such that $B$ has only $k$ unique rows and $A_{ij} \leq B_{ij}$ for all $i$ and $j$.
Find
$$\min_{B \in S} \|A - B\|$$

Example:
Suppose $k = 2$ and
$$A = \begin{bmatrix}
1 & 0 & 0 & 0 & 0\\
1 & 1 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 1\\
0 & 0 & 1 & 1 & 0\\
\end{bmatrix}$$

Then the optimal matrix $B$ is
$$B = \begin{bmatrix}
1 & 1 & 0 & 0 & 0\\
1 & 1 & 0 & 0 & 0\\
0 & 0 & 1 & 1 & 1\\
0 & 0 & 1 & 1 & 1\\
\end{bmatrix}$$

Here $B$ has only $2$ distinct rows, $A \leq B$ entrywise, and $\|A - B\| = 3$ is the minimum.

Question 1:
This problem reminds me of minimum $k$-union, set-union, and other NP-complete problems. Is this optimization problem NP-hard?

Question 2:
Is there an efficient way to obtain an approximately optimal matrix $B \in S$? Instead of minimizing $\|A - B\|$ exactly, can we get close to the minimum possible value?

So far, I have tried clustering the rows of $A$ using k-means. Within each cluster $i$, I construct a vector $v_i$ whose $j^{th}$ entry is $1$ if at least $p$ percent of the vectors in cluster $i$ have $j^{th}$ entry equal to $1$. The vectors $v_i$ serve as an initial guess for the possible rows of $B$. I then apply a greedy algorithm. This has decent performance, but it’s not great.
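One observation that may help when evaluating any clustering heuristic: once the rows of $A$ are assigned to clusters, the cheapest valid row for each cluster is the entrywise OR of its members, because every row of $B$ must dominate every row of $A$ mapped to it. The sketch below scores an assignment that way and, for tiny inputs only, brute-forces all assignments as a baseline. It assumes $\|A - B\|$ counts the entries flipped from $0$ to $1$ (consistent with the value $3$ in the example, but not stated in the question); the function names are illustrative.

```python
from itertools import product


def cover_rows(A, labels, k):
    """Entrywise OR of the rows in each cluster: the cheapest row that
    dominates every member of that cluster."""
    n_cols = len(A[0])
    reps = [[0] * n_cols for _ in range(k)]
    for row, c in zip(A, labels):
        reps[c] = [r | a for r, a in zip(reps[c], row)]
    return reps


def cost(A, labels, reps):
    """Number of entries flipped from 0 to 1 (assumed meaning of ||A - B||)."""
    return sum(b - a for row, c in zip(A, labels) for a, b in zip(row, reps[c]))


def brute_force(A, k):
    """Exhaustive baseline over all k^n assignments; tiny inputs only."""
    best = None
    for labels in product(range(k), repeat=len(A)):
        reps = cover_rows(A, labels, k)
        c = cost(A, labels, reps)
        if best is None or c < best[0]:
            best = (c, labels, [reps[l] for l in labels])
    return best


A = [
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 0],
]
best_cost, best_labels, B = brute_force(A, 2)
print(best_cost, best_labels)
for row in B:
    print(row)
```

On the example matrix this reproduces the matrix $B$ given above. A k-means or greedy assignment can be scored with the same `cover_rows` and `cost` functions, which replaces the $p$-percent threshold with a representative that is guaranteed to satisfy $A \leq B$.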