I used to work closely with incredibly smart people who dealt with things like data sharding on a daily basis, and from them I learned a lot about that topic. Later I moved to a different role where that knowledge was not needed, and it faded away over time. Here I’m trying to reclaim that long-forgotten knowledge.
Sharding is the process of assigning an item to a shard: a smaller chunk of data carved out of a large database or other service. The general idea is that we can distribute data or a service across multiple locations to handle larger volumes of data and more requests, and with replication we can scale even further and make the system more resilient. But we need clear rules for how we assign partitions, aka shards, so that we can route requests to the right location.
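As a minimal illustration (my own sketch, not from any particular system), one common assignment rule is a stable hash of the item’s key modulo the shard count, so every router independently computes the same shard for the same key. The shard count and key names here are hypothetical:

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count

def shard_for(key: str) -> int:
    """Map a key to a shard via a stable hash, so every router
    node computes the same shard for the same key."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# The same key always routes to the same shard:
assert shard_for("user-42") == shard_for("user-42")
```

Note that a plain modulo scheme reshuffles most keys when the shard count changes; schemes like consistent hashing exist to limit that churn.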
One of the questions I often ask in my interviews is to design a log processing library:
You need to write a library for processing logs in the following format:
timestamp<TAB>message
The library will be handed over to a different team for further maintenance and improvements, so maintainability and extensibility are the most important requirements.
The library needs to support the following operations out of the box:
filtering
counting
histograms
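One way a candidate might start (this is my illustrative sketch, not a reference answer) is with small composable functions over a stream of parsed entries, which keeps the library easy to extend with new operations:

```python
from collections import Counter
from typing import Callable, Iterable, Iterator, Tuple

Entry = Tuple[str, str]  # (timestamp, message)

def parse(lines: Iterable[str]) -> Iterator[Entry]:
    """Split each 'timestamp<TAB>message' line into an Entry."""
    for line in lines:
        ts, _, msg = line.rstrip("\n").partition("\t")
        yield ts, msg

def filter_entries(entries: Iterable[Entry],
                   pred: Callable[[Entry], bool]) -> Iterator[Entry]:
    return (e for e in entries if pred(e))

def count(entries: Iterable[Entry]) -> int:
    return sum(1 for _ in entries)

def histogram(entries: Iterable[Entry],
              key: Callable[[Entry], str]) -> Counter:
    """Bucket entries by an arbitrary key, e.g. log level or hour."""
    return Counter(key(e) for e in entries)

logs = ["1700000000\tERROR disk full", "1700000001\tINFO started"]
errors = count(filter_entries(parse(logs), lambda e: "ERROR" in e[1]))
```

Because every operation consumes the same iterator of entries, a future team can add new operations without touching the parser.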
The original version also included some language- and background-specific expectations that I never include in my assessment, because I feel they put the candidate into a position where they need to read my mind to meet them.
One of the questions I really love asking during coding interviews is this:
Given a continuous stream of words, a dictionary on disk, and a cost associated with reading from disk, create a stream processor that returns true when a word exists in the dictionary, while minimizing the cost of reading from disk.
Example:
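As one illustration of the idea (my own sketch, not the reference answer; `disk_lookup` is a hypothetical stand-in for the on-disk dictionary), memoizing lookups in memory means repeated words in the stream never touch the disk twice:

```python
from functools import lru_cache

disk_reads = 0  # the cost we are trying to minimize

def disk_lookup(word: str) -> bool:
    """Hypothetical stand-in for reading the on-disk dictionary."""
    global disk_reads
    disk_reads += 1
    return word in {"cat", "dog", "fish"}  # pretend this set lives on disk

@lru_cache(maxsize=4096)
def exists(word: str) -> bool:
    # Repeated words in the stream are answered from memory.
    return disk_lookup(word)

for word in ["cat", "cat", "bird", "cat", "bird"]:
    exists(word)
# Five stream items, but only two distinct words hit the disk.
```

How well this works depends entirely on how often words repeat in the stream, which is exactly the discussion the question is designed to provoke.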
Recently I was reading through a bunch of technical designs and noticed a common mistake when it comes to writing user stories and requirements: assuming a solution. The biggest issue for me when I write requirements myself is that whenever I include a part of the solution I’m thinking about, it limits my ability to innovate, since I’m bound to that specific solution. In many cases I observed improvements in my designs when I focused on what the customer needs rather than on fulfilling a requirement tied to my first, and probably not brightest, idea.
It is interesting to observe that any endeavor where attention is one of the key metrics or key drivers, regardless of the company’s size, ends up in the same hell pit of attention craving and optimizing for it. Even small single-person blogs that teach us to be a better person, engineer, or something else are prone to this. Many of them, the ones I used to follow, slowly became “Energy Vampires” to me, constantly seeking my attention.
Every so often I interview senior software engineers for Amazon, where I ask more or less the same questions in each interview. One of them requires adding caching logic to get better results. I’ve noticed that interviewees make one of two mistakes that block them from standing out as software engineers:
they don’t know, or don’t talk about, the conditions under which a cache performs best; primarily, how the request frequency distribution affects cache performance.
they don’t know the standard library of the programming language of their choice.
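The first point can be made concrete with a small simulation (my illustration, with made-up numbers): the same fixed-size LRU cache has a high hit rate when requests follow a skewed, Zipf-like distribution and is nearly useless when requests are uniform:

```python
import random
from collections import OrderedDict

random.seed(42)

class LRUCache:
    """Tiny LRU cache that records hits and misses."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()
        self.hits = self.misses = 0

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # mark as recently used
            self.hits += 1
        else:
            self.misses += 1
            if len(self.data) >= self.capacity:
                self.data.popitem(last=False)  # evict least recently used
            self.data[key] = True

def hit_rate(population, weights, requests=10_000, capacity=100):
    cache = LRUCache(capacity)
    for key in random.choices(population, weights=weights, k=requests):
        cache.get(key)
    return cache.hits / requests

keys = list(range(10_000))
zipf = [1 / (rank + 1) for rank in keys]  # a few hot keys dominate
uniform = [1.0] * len(keys)               # every key equally likely

# A cache sized at 1% of the key space shines under the skewed
# distribution and barely helps under the uniform one.
assert hit_rate(keys, zipf) > hit_rate(keys, uniform)
```

Being able to reason about this, e.g. that most real traffic is heavily skewed, which is why small caches work at all, is what separates a memorized answer from an engineering discussion.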
The same podcast, Joscha Bach: Life, Intelligence, Consciousness, AI & the Future of Humans | Lex Fridman Podcast #392, which I mentioned in the previous post, AI, people, trees, and mushrooms: the same software different hardware, triggered another chain of thought. Joscha was talking about how our neurons always operate on data available right here and right now. That is enough to build complex systems like the human brain. Working together, neurons form parts responsible for memories, image processing, data buses, etc. But ultimately, each of them individually works only with data provided by other neurons. In a similar fashion, neural networks in GPTs are just matrix multiplications connected with each other, forming memories, attention, generation, etc.
Exploring the idea that all living things have a spirit, or at least the ability to carry neurological signals.
Recently, I listened to Joscha Bach: Life, Intelligence, Consciousness, AI & the Future of Humans | Lex Fridman Podcast #392 where Joscha and Lex discussed different ideas about consciousness, neurology, and AI. At one point, they talked about the ability of all types of cells to process neurological signals. The key difference is that neurons can process data much faster, over longer distances, and interact with more neighbors at once.
It all started a while ago. In high school, I was into cyberpunk, reading and watching about hackers, virtual reality, etc. Neuromancer, Johnny Mnemonic by William Gibson, Labyrinth of Reflections by Sergei Lukyanenko, The Matrix, and The Lawnmower Man were my go-to entertainment. Then I forgot about it until the recent generative AI explosion and… a few episodes of Rick and Morty. In one episode, Morty plays a VR game where he starts as a newborn with no memories of life outside the game and lives an entire life until he dies at 60-80 years old. In another episode, he gets stuck in a game where his consciousness is fragmented into pieces, acting as an entire world of independent agents.
Amazon is famous for its writing culture, which I discovered later in my career. The more I wrote, the easier it was to apply a similar approach to other aspects of software development.
Introduction
When I joined Amazon, I transitioned from a company with a markedly different culture, particularly in its writing and development processes. Initially, I was skeptical of Amazon’s writing-centric culture. However, over the next seven years, I gradually embraced the Amazon style of writing and came to excel at it.