I'm using dedicated SQL pools, AKA Azure Synapse (which is different from the serverless/on-demand Synapse that comes with Azure Synapse Analytics). According to Azure:
While the syntax of partitioning may be slightly different from SQL Server, the basic concepts are the same.
- I have a table that receives ~33 million new rows per day.
- A column named version indicates the day on which each row arrived. E.g. all 33 million rows arriving on
- Currently it has ~300 million rows (9 days worth of data).
- Table has daily partitions.
Here is the order of things I did:
- create table with 345 daily partitions (from 22-May-2021 to 30-Apr-2022).
- insert 9 days (33 million x 9 = ~300 million rows) worth of data into this table (I used INSERT INTO mytable SELECT * FROM some_other_table_with_300_million_rows)
- update statistics mytable
- Expectation: I have 9 partitions with ~33 million rows and 336 partitions with 0 rows.
- Reality: All partitions report an (almost) equal number of rows.
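The steps above could be sketched roughly as follows. The schema name dbo, the payload column, the date type of version, the ROUND_ROBIN distribution, and the RANGE RIGHT boundaries are all assumptions made for illustration (only three boundary values are shown). The clustered columnstore index is consistent with the COLUMNSTORE compression that sys.partitions reports. The last query is one way to check the expectation against the data itself rather than against metadata.

```sql
-- Hypothetical table definition; column names and types are assumptions.
CREATE TABLE dbo.mytable
(
    [version] date         NOT NULL,  -- day on which the row arrived
    payload   varchar(100) NULL       -- placeholder for the real columns
)
WITH
(
    DISTRIBUTION = ROUND_ROBIN,       -- assumption; the real table may be hash-distributed
    CLUSTERED COLUMNSTORE INDEX,
    PARTITION
    (
        [version] RANGE RIGHT FOR VALUES
        (
            '2021-05-22', '2021-05-23', '2021-05-24'
            /* the real table would continue with one boundary per day */
        )
    )
);

-- Load the 9 days of data and refresh statistics, as described in the steps.
INSERT INTO dbo.mytable
SELECT * FROM dbo.some_other_table_with_300_million_rows;

UPDATE STATISTICS dbo.mytable;

-- Check the expectation directly: rows per arrival day.
SELECT [version], COUNT_BIG(*) AS row_count
FROM dbo.mytable
GROUP BY [version]
ORDER BY [version];
```

The GROUP BY query scans the table, so it is a sanity check rather than a cheap replacement for partition metadata.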
When I run this SQL statement:
SELECT partition_id, index_id, partition_number, rows, data_compression_desc FROM sys.partitions WHERE object_id = OBJECT_ID('mytable')
I get this result:
partition_id       index_id  partition_number  rows    data_compression_desc
72057597508452352  1         1                 868784  COLUMNSTORE
72057597508583424  1         2                 868784  COLUMNSTORE
......
72057597553410048  1         344               868784  COLUMNSTORE
72057597553541120  1         345               868812  COLUMNSTORE
868784 * 345 = ~300 million
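The same total can be read from the catalog view instead of being computed by hand. This is a minimal sketch that assumes the table has only the one index shown in the result above (index_id = 1).

```sql
-- Sum the per-partition row counts reported by sys.partitions.
SELECT COUNT(*)            AS partition_count,
       SUM(CAST(p.rows AS bigint)) AS total_rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('mytable')
  AND p.index_id = 1;
```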
This SQL statement however:
SELECT * FROM sys.dm_pdw_nodes_db_partition_stats WHERE object_id = OBJECT_ID('mytable')
returns no rows.
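One possible explanation, offered here as an assumption rather than a confirmed diagnosis: the node-level DMVs are keyed by the object_id of the physical per-distribution tables, not by the object_id that OBJECT_ID('mytable') returns on the control node. A sketch that maps between the two through sys.pdw_table_mappings and sys.pdw_nodes_tables, and then sums row counts per partition across all distributions, might look like this:

```sql
-- Map the logical table to its physical per-distribution tables,
-- then aggregate node-level partition stats by partition number.
SELECT nps.partition_number,
       SUM(nps.row_count) AS row_count
FROM sys.tables AS t
JOIN sys.pdw_table_mappings AS tm
  ON t.object_id = tm.object_id
JOIN sys.pdw_nodes_tables AS nt
  ON tm.physical_name = nt.name
JOIN sys.dm_pdw_nodes_db_partition_stats AS nps
  ON  nt.object_id       = nps.object_id
  AND nt.pdw_node_id     = nps.pdw_node_id
  AND nt.distribution_id = nps.distribution_id
WHERE t.object_id = OBJECT_ID('mytable')
  AND nps.index_id <= 1              -- heap or clustered index only
GROUP BY nps.partition_number
ORDER BY nps.partition_number;
```

If the assumption holds, this should return one row per partition with the row count summed over the distributions, which can then be compared with the sys.partitions output above.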
Originally posted here, before I discovered the DBA community: https://stackoverflow.com/questions/67824876/getting-insight-into-partitions-of-a-table-in-azure-synapse-similar-to-sql-serv