This is exported from my Logseq graph, so it looks sorta weird. I'll attempt to update this as my notes get updated 🙂
The knowledge graph equivalent can be found here.
-
Links
-
IO
- What every programmer should know about solid state drives
-
Storage Algorithms
- Write-Optimized Dynamic Hashing for Persistent Memory
- https://www.usenix.org/system/files/fast19-nam.pdf
- @Write-Optimized Dynamic Hashing for Persistent Memory
- CCEH
- MehDB uses an implementation of this
- Log Structured Merge Trees
- https://www.cs.umb.edu/~poneil/lsmtree.pdf
- https://github.com/facebook/rocksdb/wiki/How-we-keep-track-of-live-SST-files
- The entire "implementation details" section is good
- BW Trees
- https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/bw-tree-icde2013-final.pdf
- Developed by Microsoft
- Binary Tree derivative
- Hitchhiker Tree
- https://github.com/datacrypt-project/hitchhiker-tree
- https://www.youtube.com/watch?v=jdn617M3-P4
- Binary Tree derivative
- Skip List
- https://brilliant.org/wiki/skip-lists/
- Write-Optimized Dynamic Hashing for Persistent Memory
-
Hashing/Hashmaps
- Extendible hashing
- https://www.geeksforgeeks.org/extendible-hashing-dynamic-approach-to-dbms/
- https://en.wikipedia.org/wiki/Extendible_hashing
- EVMap
- https://www.youtube.com/watch?v=eLNAMEoKAAc
- Extendible hashing
-
Compression
- ZSTD
- https://facebook.github.io/zstd/
- ZSTD
-
Replication
- View-stamped Replication
- VSR
- https://pmg.csail.mit.edu/papers/vr-revisited.pdf
- https://pmg.csail.mit.edu/papers/vr.pdf
- Original paper
- Raft
- https://raft.github.io/
- Paxos
- https://lamport.azurewebsites.net/pubs/paxos-simple.pdf
- https://lamport.azurewebsites.net/pubs/lamport-paxos.pdf
- View-stamped Replication
-
Practical Examples
- GFS
- https://www.micahlerner.com/2020/03/22/understanding-googles-file-system.html
- Pinecone
- Vector database
- https://www.pinecone.io/learn/rust-rewrite/
- GFS
-
-
Notes
-
Hard Drives and Solid State Drives
- Align to 4Kb page size
- Deletes don't happen when you think they happen
- Hard drives are very bad at random access
- Spacial locality + reduced # of total reads for a query are very important
- SSDs are much better at random reads
-
File Systems
- The (user's) choice of file system can have a significant impact on the performance of a database
- Copy-on-write file systems (such as ZFS) increase write-amplification and are not generally well suited to hosting databases that update files in-place, though this of course depends
-
Record storage
- For datasets that are of fixed, consistent size it's usually best to use a row-like approach
- Indexing is easy, as the offset is
size of record
* row ID + any header offset- Indexes become <column you're indexing> + row ID (unless a covering index)
- Deletes may or may not be difficult
- Requires a compaction step otherwise you store the record indefinitely, which might be ok
- Most relational databases use rows as their primary record storage mechanism with various mechanics for handling variable sized data-structures
- Indexing is easy, as the offset is
- Columnar databases are similar to row databases, but benefit from better storage locality when use cases involve longitudinal reads
- Great for large analytical engines, but suffers in performance when reading all columns associated with a particular entity/row due to "random" disk read
- Indexing is possibly harder, but should be roughly the same as row-based
- Probably want to store each column as a separate file to facilitate easy appending of new records
- Key-value stores are purely associative data-structures
- Typically do not care about record byte alignment
- Commonly implementations use Log Structured Merge Trees and hash maps
- Fact or datalog databases... are weird
- Apparently were more popular at some point, but have largely fallen out of style (unless you're a Clojure developer)
- Few implementations
- Mozilla had one, abandoned
- TODO get link for this
- Datomic is most prominent modern-ish example
- Mozilla had one, abandoned
- Records stored as tuples with indexed values
- For datasets that are of fixed, consistent size it's usually best to use a row-like approach
-