postgresql – PSQL incomplete backup: how to debug

I have been backing up my "screen" database.

After connecting with psql and running \l+

I am receiving (among other things):

  Name  | Owner | Encoding |   Collate   |    Ctype    | Access privileges | Size  | Tablespace | Description
--------+-------+----------+-------------+-------------+-------------------+-------+------------+-------------
 screen | admin | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                   | 36 GB | pg_default |

The size of my database is around 36 GB.

Now I usually make regular backups by running:

pg_dump screen > screenbackup.bak

And the size of the output was always quite consistent with the size of my database.

But today I have a backup of only 8 GB and that seems very strange to me.

After restoring to a temporary database and making a query, some data seems to be missing.

The size of the restored data was 10GB.

To be clear, I never dropped the original database, yet there always seems to be a problem: I cannot get pg_dump or pg_dumpall to back up the entire database. The size is inconsistent; in my experience the dump is usually at least as large as the database, not four times smaller.

Do you have any idea where to go to know where the problem is coming from?

Edit, in case it matters: I'm on PostgreSQL 9.4.4.1, and I'm making this backup so that I can upgrade PostgreSQL as well as just save my data.
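As a first diagnostic step (a sketch, not from the original question; it needs access to the server, and "screen_restore" is a placeholder name for the temporary restore database), running pg_dump with --verbose and checking its exit status surfaces errors that a plain redirect silently swallows, and comparing per-table row counts between the original and the restored copy narrows down which tables lost data:

```shell
# Run the dump verbosely; a plain "pg_dump db > file" hides errors sent to stderr.
pg_dump --verbose screen > screenbackup.bak 2> dump.log
if [ $? -ne 0 ]; then
    echo "pg_dump reported an error; see dump.log" >&2
fi

# Compare per-table row counts between the original and the restored copy.
for db in screen screen_restore; do
    psql -d "$db" -At -c \
      "SELECT relname, n_live_tup FROM pg_stat_user_tables ORDER BY relname;" \
      > "counts_$db.txt"
done
diff counts_screen.txt counts_screen_restore.txt
```

Any table that appears in one file but not the other, or with a very different count, is where to look first.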

postgresql – Why are the psql commands in my script suddenly killed by jenkins / hudson?

I have an existing jenkins job that starts a shell script to copy my prod environment into qa.

We added a lot of data to prod (the gzipped dump went from 2 GB to 15 GB) and, suddenly, my Jenkins jobs started to fail.

We are running Postgres 9.5 on AWS and Jenkins 2.171. All Jenkins jobs run on the master, which is the same server, with 6 executors. No memory/CPU/disk-space problems.

I tried a few things: statement_timeout on the Postgres instance is already 0. Switching from bash to sh helped for some scripts, for some reason, but not for others. In particular, several psql statements are still killed. The script works fine when run from an interactive shell.

We also tried disabling the Process Tree Killer (https://wiki.jenkins.io/display/JENKINS/ProcessTreeKiller). No luck.

Here is the code for two of the most innocuous commands, which should execute fairly quickly. $POSTGRES_HOST_OPTS only contains the database name and the port:

echo -e "Running the POSTGIS command"
psql $POSTGRES_HOST_OPTS -U $POSTGRES_ENV_POSTGRES_USER_PROD -d postgres -c "CREATE EXTENSION postgis;"

echo -e "Creating the temporary user dv3_qa_tmp so that we can rename the $POSTGRES_ENV_POSTGRES_USER_PROD user\n"
psql $POSTGRES_HOST_OPTS -U $POSTGRES_ENV_POSTGRES_USER_PROD -d postgres -c "create role dv3_qa_tmp password '$PGPASSWORD_QA' createdb createrole inherit login;"

Here is the jenkins console output:

Waiting for the new instance to be available ...
-e Renaming the dv3_prod database to dv3_qa

Killed
-e Running the POSTGIS command
Killed
-e Creating the temporary user dv3_qa_tmp so that we can rename the dv3_prod_user user

Killed
-e Renaming the user dv3_prod_user to dv3_qa_user

Killed
Killed
-e
All done

In jenkins.log there is something about file descriptors, but I'm not sure how that relates. I have also tried redirecting stderr, which removes this message but does not stop the commands from being killed.

April 10, 2019 16:23:31 hudson.Proc$LocalProc join
WARNING: Process leaked file descriptors. See https://jenkins.io/redirect/troubleshooting/process-leaked-file-descriptors for more information.
java.lang.Exception
	at hudson.Proc$LocalProc.join(Proc.java:334)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
	at hudson.model.Build$BuildExecution.build(Build.java:206)
	at hudson.model.Build$BuildExecution.doRun(Build.java:163)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
	at hudson.model.Run.execute(Run.java:1818)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:97)
	at hudson.model.Executor.run(Executor.java:429)
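One way to see who is killing the commands (a sketch, not from the original question; it needs the same environment variables and a reachable server) is to record each psql command's exit status: a process killed by a signal exits with status 128 plus the signal number, so 137 means SIGKILL (often the OOM killer or Jenkins' process killer) and 143 means SIGTERM:

```shell
psql $POSTGRES_HOST_OPTS -U "$POSTGRES_ENV_POSTGRES_USER_PROD" -d postgres \
     -c "CREATE EXTENSION postgis;"
status=$?
if [ "$status" -gt 128 ]; then
    # Exit status 128 + signal number: 137 = SIGKILL, 143 = SIGTERM
    echo "psql was killed by signal $((status - 128))" >&2
else
    echo "psql exited with status $status"
fi
```

Logging this for every statement in the script pinpoints which commands die and with which signal, which in turn distinguishes a Jenkins-side kill from a server-side disconnect.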

How to psql to a specific PG server

I have both PG 9.5 and PG 11 installed, and I would like the psql command line to connect to the 11 server. How can I do that? I can open psql on the command line, but only the tables from the 9.5 server are available. I also use pgAdmin 4.
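Each PostgreSQL major version runs its own server process listening on its own port; 5432 is the default, and a second cluster typically gets 5433, though the actual port depends on the installation. A sketch, assuming the PG 11 cluster listens on 5433 (check its postgresql.conf, or `pg_lsclusters` on Debian/Ubuntu, for the real port):

```shell
# Show which server the current session is actually connected to
psql -c "SELECT version();"

# Connect explicitly to the PG 11 cluster by host and port
# (5433 is an assumption, not from the question)
psql -h localhost -p 5433 -U postgres
```

The same host/port pair can be entered in pgAdmin 4's connection dialog to register the second server there.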

postgresql – Table psql Description

In psql, if I do \d+, I get something like

 Schema |          Name         |   Type   | Owner  |    Size    | Description
--------+-----------------------+----------+--------+------------+-------------
 norsar | routing_result        | table    | morten | 16 kB      |
 norsar | routing_result_id_seq | sequence | morten | 8192 bytes |

and so on. The last field, "Description", is always blank. Where does that field get its content? And consequently, how can I set a description on a table? In some cases that could be quite useful; the same goes for the column descriptions shown by \d+.

I tried \set ECHO_HIDDEN on, but that did not make it much clearer.
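For what it's worth, the Description column is read from the pg_description catalog and is set with the COMMENT command; this is standard PostgreSQL, though the table and column names below are taken from the \d+ output above and may not match the real schema exactly:

```sql
-- Set a table description (shown in the Description column of \d+ and \dt+)
COMMENT ON TABLE routing_result IS 'Results of the routing computation';

-- Column descriptions appear in the per-column listing of \d+ routing_result
COMMENT ON COLUMN routing_result.id IS 'Surrogate key';
```

Running \d+ with \set ECHO_HIDDEN on afterwards shows the underlying query joining against pg_description, which is where psql reads these from.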

postgresql – increasing work_mem does not improve query performance

I am very new to database administration and could use any advice from more experienced PostgreSQL users.

My database is an AWS RDS db.m4.large, whose specifications are here.

My queries against one of the largest tables in the database (330 million rows) are running extremely slowly and/or timing out, even for a query like

SELECT DISTINCT value FROM mytable;

where there are only 4 or 5 unique values in the value column. It is not indexed.

pg_size_pretty(pg_total_relation_size(relid)) shows the size of the table to be 118 GB.

Even with work_mem set to 240 MB, which should let Postgres do larger sorts in memory, the query is just as slow:

EXPLAIN ANALYZE SELECT DISTINCT value FROM mytable;
also times out.

What does one do to optimize memory use for querying a database this large? Which Postgres settings should I monitor and/or configure differently?

Any help would be appreciated.
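A sketch of the usual diagnostic steps, using the table and column names from the question (not a definitive fix): on a 118 GB table with no index, DISTINCT forces a sequential scan of the whole heap, so the query is I/O-bound and work_mem barely matters; the BUFFERS option of EXPLAIN makes that visible, and a small index on the column lets the scan read far less data:

```sql
-- Confirm the setting actually applies to this session (RDS parameter-group
-- changes do not always reach existing connections)
SET work_mem = '240MB';
SHOW work_mem;

-- BUFFERS shows how many blocks were read; if "shared read" dominates,
-- the query is disk-bound and more work_mem will not help
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT value FROM mytable;

-- With only a handful of distinct values, an index on "value" allows a much
-- smaller index-only scan instead of scanning the 118 GB heap
CREATE INDEX CONCURRENTLY mytable_value_idx ON mytable (value);
```

CREATE INDEX CONCURRENTLY avoids blocking writes while the index builds, at the cost of a slower build.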

postgresql – psql: create random datasets with the same statistical distribution as another table

For testing purposes, I want to create data sets (tables) where the statistical distribution of the values in the columns is the same as that of my real data, but random.

Example: consider the following table ('articles', for a clothing store):

| id | type  | color  | size | price |
----------------------------------------
| 1  | shirt | yellow | M    | 14.99 |
| 2  | pants | brown  | L    | 20.00 |
| 3  | shirt | red    | L    | 14.99 |
| 4  | pants | red    | L    | 20.00 |
| 5  | cap   | yellow | M    |  5.00 |
| 6  | cap   | brown  | S    |  5.00 |
| 7  | shirt | red    | M    | 14.99 |
| 8  | pants | red    | L    | 20.00 |

So, of these 8 rows, I have the following distributions for each column:

  • Type: 3 shirts, 3 pants, 2 caps.
  • Color: 4 red, 2 yellow, 2 brown.
  • Size: 1 S, 3 M, 4 L
  • Price: 2 x 5.00, 3 x 14.99, 3 x 20.00

What I want is to create another table, with the same columns and the same number of rows, in which I have for each separate column the same statistical distribution of values, but assigned randomly and independently of one another.

Making the statistics for each column is quite easy:

SELECT column_name, COUNT(1) AS number_of_rows
FROM articles GROUP BY column_name;

I can also easily record these statistics in a dedicated table:

CREATE TABLE articles_stats (column_name varchar(255), value varchar(255), number_of_rows integer);

INSERT INTO articles_stats (column_name, value, number_of_rows)
SELECT 'type', type, COUNT(1) AS number_of_rows
FROM articles GROUP BY type
UNION
SELECT 'color', color, COUNT(1) AS number_of_rows
FROM articles GROUP BY color
UNION
SELECT 'size', size, COUNT(1) AS number_of_rows
FROM articles GROUP BY size
UNION
SELECT 'price', price::varchar(255), COUNT(1) AS number_of_rows
FROM articles GROUP BY price;

In this example, this would create the following articles_stats table:

| column_name | value | number_of_rows |
-------------------------------------------
| type | shirt | 3 |
| type | pants | 3 |
| type | cap | 2 |
| color | red | 4 |
| color | yellow | 2 |
| color | brown | 2 |
| size | S | 1 |
| size | M | 3 |
| size | L | 4 |
| price | 5.00 | 2 |
| price | 14.99 | 3 |
| price | 20.00 | 3 |

But how can I create the inserts into the target table (let's call it 'random_articles')?

PS: I have to do this many times, so I hope to create a PL/pgSQL function for this.
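One possible approach (a sketch, not from the question; it shuffles each column of articles independently rather than going through the articles_stats table) is to number each column's values in a random order and join the shuffled columns on that number. Each column keeps exactly its original distribution, but the combinations across columns become random and independent:

```sql
CREATE TABLE random_articles AS
SELECT t.rn AS id, t.type, c.color, s.size, p.price
FROM  (SELECT type,  row_number() OVER (ORDER BY random()) AS rn FROM articles) t
JOIN  (SELECT color, row_number() OVER (ORDER BY random()) AS rn FROM articles) c USING (rn)
JOIN  (SELECT size,  row_number() OVER (ORDER BY random()) AS rn FROM articles) s USING (rn)
JOIN  (SELECT price, row_number() OVER (ORDER BY random()) AS rn FROM articles) p USING (rn);
```

Since the result is a permutation of each column, the per-column counts match the original table by construction; wrapping this statement in a PL/pgSQL function that takes the table name would let it be reused.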

Get interactive psql shell after running queries

I want to execute a set of commands before running other queries with psql.

The following is what I want to execute.

psql << EOF
\setenv PAGER 'pspg -s 0'
\pset pager always
EOF

After executing this, I want to have an interactive shell with these parameters set.

How can I get an interactive shell after this?
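A heredoc makes psql read its commands from stdin and exit, so it never becomes interactive. Two common alternatives (a sketch; pspg must be installed for the pager setting to be useful): pass the settings on the command line at startup, or put them in ~/.psqlrc, which psql runs before every interactive session:

```shell
# Option 1: set the pager and the pset option at startup; psql stays interactive
PAGER='pspg -s 0' psql --pset=pager=always

# Option 2: make the settings permanent for every interactive session
cat >> ~/.psqlrc << 'EOF'
\setenv PAGER 'pspg -s 0'
\pset pager always
EOF
psql
```

Option 1 is per-invocation (`-P pager=always` is the short form of `--pset`); option 2 applies to all future sessions and can be skipped with `psql -X` when not wanted.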