What I Learned This Week #3
Data Models, Memory Access Patterns, Edge Computing, and Lexers.
👋 Hi, this is Gabriel with this week’s learnings. I write about software, startups and things that interest me enough to learn about. Thank you for your readership.
This week I’m sharing my top learnings about Data Models, Memory Access Patterns, Edge Computing, Lexers, and more. Hope it is helpful!
Learned while reading
Read Designing Data-Intensive Applications Chapters 1 and 2, Understanding Software Dynamics Chapters 3 and 4, TursoDB’s Hrana Protocol Specification, and Writing an Interpreter in Go Chapter 1.
Data models are extremely important. Different data models offer different advantages depending on the relationships among your data elements and the access patterns your use cases require. Does your data have a lot of many-to-many relationships? Well, modeling your data as a graph might offer you some advantages. Mostly one-to-many relationships? Perhaps you should model your data as documents. How you model your data not only shapes how these relationships are physically stored, but also defines the performance characteristics for certain access patterns. For analytical workloads, you typically need to access a subset of columns for all rows rather than entire rows of data. In such cases, a column-oriented layout usually outperforms a row-oriented one (the default in most relational databases), because only the requested columns have to be read. In a column-oriented setup, querying specific columns (e.g., "SELECT column1, column2") will be faster than retrieving entire rows ("SELECT *").
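To make the row vs. column distinction concrete, here is a minimal sketch that stores the same three records both ways. The data and the function names are made up for illustration; real column stores add compression, on-disk formats and much more.

```python
# The same three records stored two ways (illustrative data only).
rows = [
    {"id": 1, "country": "US", "amount": 30},
    {"id": 2, "country": "JP", "amount": 50},
    {"id": 3, "country": "US", "amount": 20},
]

# Column-oriented layout: one list per column, aligned by position.
columns = {
    "id": [1, 2, 3],
    "country": ["US", "JP", "US"],
    "amount": [30, 50, 20],
}


def total_amount_row_store(rows):
    # Every full record is visited even though only "amount" is needed.
    return sum(row["amount"] for row in rows)


def total_amount_column_store(columns):
    # Only the "amount" column is touched; the other columns are never read.
    return sum(columns["amount"])


def fetch_record_column_store(columns, index):
    # Rebuilding a whole row costs one lookup per column.
    return {name: values[index] for name, values in columns.items()}
```

The aggregate only needs one list in the column layout, while reassembling a full record (the "SELECT *" case) has to touch every column, which is the trade-off described above.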
Reading about data models reminded me of FoundationDB’s interesting concept of layers, where the data storage is decoupled from the data model. This approach allows FoundationDB to support various higher-level data models (e.g., documents, column-oriented, row-oriented, graphs) on top of a simple key-value (KV) data storage. The idea of using key-value pairs as the fundamental data model primitive is intriguing.
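As a rough sketch of the layer idea, a document model can be built on top of nothing more than key-value pairs. This is not FoundationDB's API: a plain Python dict stands in for the store, and the key layout `(collection, doc_id, field)` is an assumption made for illustration.

```python
# A plain dict stands in for an ordered key-value store (assumption for illustration).
kv = {}


def put_document(kv, collection, doc_id, doc):
    # Flatten each field into its own key: (collection, doc_id, field) -> value.
    for field, value in doc.items():
        kv[(collection, doc_id, field)] = value


def get_document(kv, collection, doc_id):
    # Reassemble the document from all keys sharing the (collection, doc_id) prefix.
    # A real ordered store would do a range scan instead of checking every key.
    prefix = (collection, doc_id)
    return {key[2]: kv[key] for key in sorted(kv) if key[:2] == prefix}


put_document(kv, "users", 42, {"name": "Ada", "city": "London"})
```

Calling `get_document(kv, "users", 42)` rebuilds the original document, even though the store itself only knows about keys and values.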
Memory access patterns have a substantial impact on computation time. This can be largely attributed to the fact that the chosen access pattern greatly influences caching behavior. Specifically, the access pattern determines both the frequency of cache misses/hits and the level (L1 Cache, L2 Cache, L3 Cache, or Main Memory) from which the final data is retrieved.
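One way to see the effect without measuring real hardware is a toy cache simulator. The sketch below models a single direct-mapped cache; the sizes (`num_lines`, `line_size`) and the matrix dimension are made-up parameters, not values from any real CPU.

```python
def count_cache_misses(addresses, num_lines=64, line_size=8):
    """Count misses in a toy direct-mapped cache (sizes are illustrative)."""
    cache = [None] * num_lines  # each slot remembers which memory block it holds
    misses = 0
    for address in addresses:
        block = address // line_size  # line-sized block containing this element
        slot = block % num_lines      # direct-mapped: one possible slot per block
        if cache[slot] != block:
            misses += 1
            cache[slot] = block
    return misses


N = 256  # N x N matrix stored row by row in one flat array

# Row-by-row traversal: consecutive addresses.
row_major = [r * N + c for r in range(N) for c in range(N)]
# Column-by-column traversal: jumps N elements between accesses.
col_major = [r * N + c for c in range(N) for r in range(N)]

row_misses = count_cache_misses(row_major)
col_misses = count_cache_misses(col_major)
```

In this toy setup the row-by-row traversal misses once per cache line, while the column-by-column traversal misses on every single access, even though both visit exactly the same elements.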
Many databases, such as Postgres, build their own custom wire protocols on top of raw TCP sockets to connect with their clients. This is because a long-lived TCP connection keeps latency low: it avoids the overhead of establishing a new connection for each request, as happens with HTTP when connections are not reused. However, in edge computing environments, raw TCP sockets are often not an option, since many platforms restrict them due to security concerns. WebSockets offer similar benefits (a persistent, bidirectional connection) while providing a more secure and standardized approach. This is why edge computing platforms generally support WebSockets as a preferred protocol for real-time communication.
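The latency argument can be put into a back-of-the-envelope model. Everything here is an assumption for illustration: the function name, the idea that each query costs one round trip, and the number of round trips a handshake takes.

```python
def total_latency_ms(num_queries, rtt_ms, handshake_rtts, persistent):
    """Rough latency model; every number passed in is an assumption."""
    setup = handshake_rtts * rtt_ms  # cost of opening one connection
    per_query = rtt_ms               # assume one round trip per query
    if persistent:
        # Pay the connection setup once, then reuse the connection.
        return setup + num_queries * per_query
    # Pay the connection setup again for every single query.
    return num_queries * (setup + per_query)
```

With 10 queries, a 20 ms round trip and a 2-round-trip handshake, the persistent connection comes out at 240 ms versus 600 ms when reconnecting every time. The exact numbers matter less than the shape: the setup cost is paid once instead of once per query.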
Lexers are the first transformation step in interpreters and compilers. A lexer's primary function in an interpreter is to process source code and transform it into a series of tokens, giving it structure in order to facilitate parsing. Tokens are the smallest meaningful units of the source code. It’s basically adding types to text.
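Here is a minimal sketch of that idea. It is written in Python rather than the book's Go, and the token names and the tiny grammar are assumptions chosen for illustration, not the book's implementation.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    type: str
    literal: str


# Token names and patterns are illustrative assumptions.
# Order matters: earlier patterns win when several could match.
TOKEN_SPEC = [
    ("LET", r"let\b"),
    ("INT", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("SEMICOLON", r";"),
    ("SKIP", r"\s+"),
    ("ILLEGAL", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source):
    """Turn raw source text into a list of typed tokens, dropping whitespace."""
    tokens = []
    for match in MASTER.finditer(source):
        if match.lastgroup != "SKIP":
            tokens.append(Token(match.lastgroup, match.group()))
    return tokens
```

For example, `lex("let five = 5 + x;")` yields the tokens LET, IDENT, ASSIGN, INT, PLUS, IDENT, SEMICOLON, each paired with the exact text it came from. The parser can then work with these typed units instead of raw characters.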
Learned while listening to podcasts
Howard Schultz had an intuition that Japan was the next market for Starbucks. A board member advised him to hire an outside firm to conduct a study. The consultant reported that success in Japan was unlikely, stating, "The economics won't work." Today, there are almost 2,000 Starbucks in Japan. Moral of the story: ignore consultants.
Uri Levine repeatedly emphasizes that starting a company “is going to be a journey of failures”. His resilience and comfort with failing (quickly) seem to be among his biggest competitive advantages. This sentiment aligns with Jensen Huang's quote: "If you want to be successful, I would encourage you to grow a tolerance for failure."
Ashley Kelly suggests that outbound sales and cold calling “is still a thing”. She advocates for the Sales Development Representative (SDR) role, which focuses on outbound prospecting and lead generation. When you start working an account, SDRs “warm up” the client so that the Account Executive can then close the deal.