postgresql – Must an index cover all columns to be used in a merge?

PostgreSQL can perform an index scan in both cases, but it prefers a sequential scan and a sort in the first case, because it thinks that will be faster. In the second case, an index-only scan is possible and faster.

You could of course add all table columns to the index, but that is usually unreasonable and may even exceed the size limit for an index entry.

If you have reason to believe that PostgreSQL is doing the wrong thing, you can test whether the query is faster with enable_seqscan set to off. If it is, you may have to lower random_page_cost to match your hardware.
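
For example, a minimal sketch of that test; the table and query are hypothetical placeholders for your own:

SET enable_seqscan = off;        -- session-local, safe to experiment with
EXPLAIN (ANALYZE, BUFFERS) SELECT id, val FROM mytable ORDER BY id;
RESET enable_seqscan;

-- if the index scan wins, lower the cost estimate for random I/O
-- (the default is 4.0; values near 1.1 are common for SSD storage)
ALTER SYSTEM SET random_page_cost = 1.1;
SELECT pg_reload_conf();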

seo – Why does Google Index Http Urls?

I can’t think of any reasons besides the ones below. It could also be a bug on Google’s end.

I’m assuming we’ve already ruled out automatic redirection per Max’s question and your answer.

Directory & File Permissions are Not Set Properly

If your directory/file permissions are set up incorrectly, it is possible that your whole file structure could be indexed by Google. The redirects could be perfectly fine, and this could still happen if certain files/directories are publicly accessible.

Permissions that would allow this also present a large security vulnerability. For proof that this can happen, please see this Google Support answer.

Do a site: search, as MrWhite suggested – if this is the case, it will be very obvious.

Also as MrWhite asked – make sure you’ve set up the property in Search Console. I recommend using the Domain Property option.


I use these commands a lot after freshly installing WordPress on Ubuntu + Nginx/Apache.

chown -R www-data: /var/www/html                    # web server user owns everything
find /var/www/html -type d -exec chmod 750 {} \;    # directories: owner rwx, group rx
find /var/www/html -type f -exec chmod 640 {} \;    # files: owner rw, group r

Run these as root, of course.


Ensure Canonical URLs are not HTTP

Within Search Console, use the URL Inspection tool to check what your User Declared Canonical is. It will show the status of the page as of Google’s last crawl.

If you see http, update your canonical URLs to use https. If there is no User Declared Canonical, it means you don’t have them set – you should fix that, then go back to Search Console and request indexing for your root domain.
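
For reference, a canonical URL is declared with a link tag in the page head; the URL here is a hypothetical example:

<link rel="canonical" href="https://example.com/some-page/" />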

sql server – How to estimate the nonclustered index size before its creation?

I created two scenarios and expected to see the same result, but the results were noticeably different. I would like to know what I did wrong.

Scenario 1

Create table mytable1 with a clustered index and a nonclustered index on the [OnlineSalesKey] and [DateKey] columns respectively. The total number of rows inserted: 12,627,608. Note that the [OnlineSalesKey] data type is INT:

(screenshot: table definition showing [OnlineSalesKey] as INT)

CREATE CLUSTERED INDEX clustered_index_1 ON mytable1 ([OnlineSalesKey]);
CREATE NONCLUSTERED INDEX nonclustered_index_1 ON mytable1 ([DateKey]);


Scenario 2

Create table [mytable1.2] with a clustered index and a nonclustered index on the [OnlineSalesKey] and [DateKey] columns respectively. The total number of rows inserted: 12,627,608. Note that the [OnlineSalesKey] data type is INT:

(screenshot: table definition showing [OnlineSalesKey] as INT)

CREATE CLUSTERED INDEX clustered_index_1_2 ON [mytable1.2] ([OnlineSalesKey]);
CREATE NONCLUSTERED INDEX nonclustered_index_1_2 ON [mytable1.2] ([DateKey]);


Expected result:

Both nonclustered indexes should have the same size, because the clustered index column’s data type is INT, so its size is 4 bytes.

Actual result:

(screenshot: the two nonclustered indexes report noticeably different sizes)

Could you please explain why these sizes are so different?
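
For context, this is roughly how the two sizes can be compared; a sketch using sys.dm_db_partition_stats, with the index names matching the ones created above:

-- report the used size of each nonclustered index in KB (8 KB per page)
SELECT i.name AS index_name,
       SUM(ps.used_page_count) * 8 AS used_kb
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id = ps.index_id
WHERE i.name IN ('nonclustered_index_1', 'nonclustered_index_1_2')
GROUP BY i.name;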

directory index – Best form for URLs: file-name.html or /file-name/

Running a Nibbler analysis on a static website I just built, I got the following feedback:

Avoid use of file extensions wherever possible. File extensions appear at the end of web addresses, and have several negative effects. They make the address harder to remember or type (particularly for non-technical users), and can reveal the underlying technology of the website making it very slightly more vulnerable to hackers. They also tie the implementation of the website to a specific technology, which can make subsequent migration of URLs difficult.

The above message is a result of having a flat directory structure and linking directly to the individual web pages (whatever.html). So is it really that much better to put every HTML file on a website in its own subdirectory (naming every page index.html and relying on the directory name to identify it) than to simply link directly to the individual HTML files? I did some searching but didn’t really find anything useful. This discussion had some info but didn’t answer my question.

I’m curious to know what folks think of the two different approaches. Thanks.
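
For what it’s worth, there is a third option: keep the flat .html files on disk but hide the extension at the web server level. A hypothetical nginx sketch (Apache can do the same with mod_rewrite):

# serve /file-name by trying file-name.html first, then a directory index
location / {
    try_files $uri $uri.html $uri/ =404;
}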

microservices – How to synchronize database and elasticsearch index

Say we have some SQL database wrapped in a microservice A, and some ElasticSearch index wrapped in a microservice B. A keeps lots of data and is slow to search, so B is used as a search database for external users.

Product wants an API endpoint that adds/updates data in A, but ensures that B’s index is updated with the same data right away, i.e. when the API returns 200, the new data is present in A and searchable in B.

So we can do a sequence like: (1) receive the request, (2) update A, (3) update B, (4) return 200.

But if updating B fails, we have new data in A but no/outdated data in B.
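
A minimal sketch of that synchronous dual-write with a compensating rollback when step (3) fails; all names here (handle_update, service_a, service_b, IndexingError) are hypothetical placeholders, not code from the linked answer:

# hypothetical services; real code would wrap the SQL DB and Elasticsearch
class IndexingError(Exception):
    pass

def handle_update(record, service_a, service_b):
    """Synchronous dual-write: update A, then B, compensating A if B fails."""
    previous = service_a.get(record["id"])   # snapshot for compensation
    service_a.upsert(record)                 # (2) update A
    try:
        service_b.index(record)              # (3) update B
    except IndexingError:
        service_a.upsert(previous)           # compensate: restore A's old state
        raise                                # caller returns 5xx instead of 200
    return 200                               # (4) both stores agree

The obvious weakness is that the compensating write to A can itself fail, which is part of why eventual-consistency approaches come up so often for this problem.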

I guess this is a common problem, and there are related answers that outline general ideas, such as microservices-handling-eventual-consistency, but how do I apply that thinking to this particular use case?

  • Is there a pattern or known similar solution?
  • Can it be solved in a good way at all, or should we convince product to accept eventual consistency instead?

magento2.4 – Catalog Search index error After updating Magento 2.4.0 to 2.4.2

When I run php bin/magento indexer:reindex I get the following error message from the Catalog Search indexer:

{"error":{"root_cause":({"type":"mapper_parsing_exception","reason":"analyzer (sku) not found for field (sku)"}),"type":"mapper_parsing_exception","reason":"analyzer (sku) not found for field (sku)"},"status":400}

I have checked that Elasticsearch 7 is connected and running. I have also disabled a search autocomplete plugin (https://www.mageworx.com/magento-2-search-autocomplete-free.html), but that did not resolve the issue.

I noticed some others had this issue, but mainly due to a Mirasvit plugin, which I do not have installed.
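
In case it is useful, the mapping the error complains about can be inspected directly in Elasticsearch; the index name below is a placeholder, so list the actual indices first:

curl -s 'http://localhost:9200/_cat/indices?v'
# then inspect the mapping of the relevant catalogsearch/product index
curl -s 'http://localhost:9200/magento2_product_1/_mapping?pretty'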

fragmentation – Running Index defragmentation with SQL Job (looks like every night!)

After running sp_blitzcache (over time) I have found that my plan cache is compiling new execution plans every day: 100% of my plan cache is built within the last 24 hours, each day.

I think these index rebuilds/reorganizes are causing this. I have found this job running every single night (I will switch it to once a week).

My question is: with these variables being commented out and not defined in this script, won’t this run on every index without any parameters? I think the previous DBA’s intention was to run reorganizes when fragmentation was over 5% and rebuilds when it was over 30%, and I think this is just running without parameters – is that the case?

--sqlcmd -E -S $(ESCAPE_SQUOTE(SRVR)) -d master -Q "EXECUTE [dbo].[IndexOptimize] @Databases = 'USER_DATABASES', @FragmentationLow = NULL,
--@FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE', @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
--@FragmentationLevel1 = 5, @FragmentationLevel2 = 30, @LogToTable = 'Y'" -b

USE master;
GO

EXECUTE dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,
    @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationLevel1 = 5,
    @FragmentationLevel2 = 30,
    @LogToTable = 'Y';

sql server – Ola Hallengren Maint Solution: How to configure Index Optimize and Integrity Check for AO Group

I’m trying to configure both index optimization and integrity checks for databases that are part of an AO group. The documentation says that those jobs should be enabled on all servers that are part of the AO group and that the script should be identical.

Will this configuration be enough?

EXECUTE [dbo].[DatabaseIntegrityCheck]
@Databases = 'AVAILABILITY_GROUP_DATABASES',
@LogToTable = 'Y'
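
Presumably the index optimization job would use the same database filter; a sketch following the same convention (this call is not from the original post, and the omitted fragmentation parameters fall back to the procedure's defaults):

EXECUTE [dbo].[IndexOptimize]
@Databases = 'AVAILABILITY_GROUP_DATABASES',
@LogToTable = 'Y'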

What about the job schedule – should all the jobs run at the same time on all servers?

dplyr – How to make a copy of every row and column in one table for every index of another table in R?

There are two dataframes, one with an index column and another without. I want to make a new dataframe with the indices of the first and the rows and columns of the second, in such a way that there is a copy of every row of the second table for each index.

df_A <- data.frame("index" = c("id1","id2","id3")
                   , variable_a = c(1,2,3)
                   , variable_b = c("x","f","d"))

df_B <- data.frame(variable_x = c("4124","414","123")
                   , variable_y = c(12,22,13)
                   , variable_z = c("q","w","d"))

The result should be:

df_C <- data.frame("index" = c("id1","id1","id1","id2","id2","id2","id3","id3","id3")
                   , variable_x = c("4124","414","123","4124","414","123","4124","414","123")
                   , variable_y = c(12,22,13,12,22,13,12,22,13)
                   , variable_z = c("q","w","d","q","w","d","q","w","d"))
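
What is being described is a cross join. Two ways to get it, as a sketch (row order may differ from the example above, so sort afterwards if it matters):

# base R: merge() with no common columns returns the Cartesian product
df_C <- merge(df_A["index"], df_B)

# dplyr >= 1.1.0 has an explicit cross join verb
library(dplyr)
df_C <- cross_join(df_A["index"], df_B)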