As a pet project I am building a database, while I am used to thinking of data in terms of table, collections, rows, and documents, I have been learning about the actual storage data structures that are used to persist data to disk (we can include memory here too even though it doesn’t have the same durability).
I’m wondering what the relationship between a table (such as what someone interacts with when making queries) and the data structures used for persistence (such as B-Trees or LSM Trees).
Some thoughts I had as to how this might be solved:
- It could be a matter of which files they’re physically stored in when persistence takes place (Not sure this one makes the most sense considering how data is written to disk in blocks and pages)
- It could be a logical separation where the storage engine adds some meta data indicating which tables a given records belongs to? Or maybe a hybrid of both
- Stemming off number two, maybe it comes down to the database’s internally managed primary index that handles this since reads and searching starts there anyway
I’m really just getting started with this project and learning about all this so it is very possible that I’m just flat out not thinking about it correctly. Let me know!