Home > Reflections | ⏮️ ⏭️

2024-06-12

Behavioral Questions

Tell me about your favorite project

This question is difficult to answer because it poses a one dimensional optimization problem in a space that is multidimensional.
There are a variety of aspects of any given project that I might enjoy. Some examples:

  1. Impact. It’s nice when a project improves people’s lives.
  2. Scale. It’s fun to work on systems that span thousands of servers across multiple continents or cater to millions of users.
  3. Learning. Some projects require learning interesting new skills or ideas.
  4. Technical excellence. I love the opportunity to apply advanced engineering tools and techniques.
  5. Product enjoyment. Typically, I make side projects for myself first. Making a game that I enjoy playing with friends, for example, is fun and satisfying.
  6. Teamwork. It can be a lot of fun to work with people.
  7. Leadership. And there’s a special kind of satisfaction in effectively leading a project that teammates and stakeholders enjoy.
  8. Challenge. Some projects seem nearly impossible at the outset. Meeting a challenge feels great.
    I should come back later and write out some stories about projects I’ve worked on in the past, highlighting these various points.

🏋 Practice

The Problem

1863. Sum of All Subset XOR Totals

The XOR total of an array is defined as the bitwise XOR of all its elements, or 0 if the array is empty.

  • For example, the XOR total of the array [2,5,6] is 2 XOR 5 XOR 6 = 1.
    Given an array nums, return the sum of all XOR totals for every subset of nums
    Note: Subsets with the same elements should be counted multiple times.
    An array a is a subset of an array b if a can be obtained from b by deleting some (possibly zero) elements of b.

🪞 Reflections

  1. This problem is labeled Easy, but I disagree.
  2. Observing that brute force is sufficient based on the size constraints on the input was key.
  3. Generating every subset is not easy to do. I had to look up techniques before implementing it. Maybe I should practice them. The bitmask technique at least seems memorable.
  4. The O(N) solution is kind of ridiculous. I imagine this problem was reverse-engineered into existence based on someone’s knowledge of this trick. That said, it would still be cool to be able to come up with it on my own. At least I knew where to look for a trick, even if I didn’t have the patience to thoroughly search and find it (nor would I have had the time to do so in an interview).
  5. My first solution was completely wrong, but at least I caught that during manual testing. Generating every subset is not trivial (see point 2 above).
  6. Was this a valuable use of time? I’m not sure. I did observe a couple of gaps in my knowledge. If I’m asked to generate every subset of a thing, I think I’ll be able to do it now.
  7. Oh yeah, bit-wise XOR on numbers isn’t very intuitive. I referenced the examples provided to determine the value of any given XOR operation (during manual testing) to avoid having to write out the bits and XOR them myself.

My Solutions

/* return Sum of all subset XOR totals  
- There are 2^n subsets of an array  
- The largest array we'll be provided is 12, thus 2^12 = 4096 subsets  
- The largest input value we'll get is 20  
  
- If brute force isn't fast enough, we'll need to find a trick to, e.g. relate the XOR total of a set with the XOR total of its subsets  
- We also need to be able to compute a bitwise XOR, which I think is `^`  
  
Observations  
- The XOR total of an empty array = 0  
- The XOR total of a single element array = the value of that element  
  
I'm not sure if brute force will be fast enough, but I'm not seeing any obvious patterns with XOR jump out for shortcuts, so let's try it out  
This will be O(2^N), exponential time. Hopefully the problems are small enough that this is acceptable.  
[8:45] done planning  
- actually, I think we can reuse smaller XOR totals from subsets to compute larger XOR totals; e.g. XOR_total(5,3,1) = XOR_total(XOR_total(5,3),1)  
That can save us a bit of recomputation.  
This way we'll only need to compute  
N - 1 pairwise XOR totals for each pair  
N - 2 pairwise XOR totals for each triple (reusing pairwise totals)  
N - 3 pairwise XOR totals for each quadruple  
... etc  
This will result in O(N^2) pairwise computations; 12^2 = 144  
Much more tractable.  
[25:15] first implementation complete  
[36:24] identified a logic bug: I'm only considering contiguous subsets  
*/  
function subsetXORSum1(nums: number[]): number { // 5 1 6  
  const cache = new Map()  
  const key = (ns: number[]): string => ns.sort().join(',')  
  let sum = 0  
  for (let c = 0; c < nums.length; c++) { // 2; 1; 0  
    for (let i = 0; i < nums.length - c; i++) { // 0; 1; 0; 2; 1; 0  
      const a = nums[i] // 5; 1; 5; 6; 1; 5  
      if (c === 0) {  
        sum += a // 12; 6; 5  
      } else if (c === 1) {  
        const b = nums[i + 1] // 6; 1  
        const xor = a ^ b // 7; 4  
        const k = key([a, b]) // 1,6; 1,5  
        cache.set(k, xor) // {1,5:4 1,6:7}  
        sum += xor // 23; 16  
      } else {  
        const kp = key(nums.slice(i + 1, i + c + 1)) // 1,6  
        const b = cache.get(kp) // 7  
        const xor = a ^ b // 2  
        const k = key(nums.slice(i, i + c + 1)) // 5,1,6  
        cache.set(k, xor) // {1,5:4 1,6:7 5,1,6:2}  
        sum += xor // 25  
      }  
    }  
  }  
  return sum  
};  
  
/* [54:02] just trying brute force to start  
- had to look up strategy to generate all subsets  
[1:13:22] finished implementation  
[1:20:10] finished 1 test  
Bugs found after testing  
1. missing fat arrow for helper function  
2. padStart does not exist on number  
3. cannot pass number to string param (padStart param 2)  
4. wrong answer [1,3] => 4 != 6 (iterate through bitmasks up to i < 2**xs.length (not - 1))  
5. wrong answer [5,1,6] => 14 != 28 (convert number to binary string by passing 2 to toString)  
[1:29:10] successful submission  
*/  
function subsetXORSum2(nums: number[]): number { // 5,1,6  
  const allNonEmptySubsets = (xs: number[]): number[][] => { // 5,1,6  
    const subsets = []  
    for (let i = 1; i < 2**xs.length; i++) { // 7; 6; 5; 4; 3; 2; 1  
      const bitMask = i.toString(2)  
        .padStart(xs.length, '0') // 111; 110; 101; 100; 011; 010; 001  
      const subset = []  
      for (let b = 0; b < bitMask.length; b++) {  
        if (bitMask[b] === '1') subset.push(xs[b]) // [5,1,6] [5,1] [5,6] [5] [1,6] [1] [6]  
      }  
      subsets.push(subset)  
    }  
    return subsets  
  }  
  
  const subsets = allNonEmptySubsets(nums)  
  
  const sum = subsets // 5 1 6 5,1 5,6 1,6 5,1,6  
    .reduce((sum, subset) => {  
      const xorTotal = subset.reduce((a, b) => a ^ b) // 2; 7; 3; 4; 6; 1; 5  
      return sum + xorTotal // 28; 26; 19; 16; 12; 6; 5  
    }, 0)  
  
  return sum // 28  
}  
  
// this is slower than not caching  
function subsetXORSum3(nums: number[]): number {  
  const allNonEmptySubsets = (xs: number[]): number[][] => {  
    const subsets = []  
    for (let i = 1; i < 2**xs.length; i++) {  
      const bitMask = i.toString(2)  
        .padStart(xs.length, '0')  
      const subset = []  
      for (let b = 0; b < bitMask.length; b++) {  
        if (bitMask[b] === '1') subset.push(xs[b])  
      }  
      subsets.push(subset)  
    }  
    return subsets  
  }  
  
  const cache = new Map()  
  const key = (xs: number[]): string => xs.sort().join(',')  
  
  const xor = (xs: number[]): number => {  
    if (xs.length === 1) return xs[0]  
    if (xs.length === 2) {  
      const result = xs[0] ^ xs[1]  
      cache.set(key(xs), result)  
      return result  
    }  
    const a = xs[0]  
    const k = key(xs.slice(1))  
    const b = cache.get(k)  
    const result = a ^ b  
    cache.set(key(xs), result)  
    return result  
  }  
  
  const sum = allNonEmptySubsets(nums)  
    .reduce((sum, subset) => sum + xor(subset), 0)  
  
  return sum  
}  
  
/* is there a trick?  
xor truth table:  
  
0 0 0  
0 1 1  
1 0 1  
1 1 0  
  
... doesn't seem helpful  
  
Can we somehow infer the sum of some subsets?  
... nothing I can think of  
  
  
what happens with non-algorithmic optimizations?  
i.e. don't explicitly generate the subsets, just iterate through them  
... perhaps predictably, this isn't much better  
*/  
function subsetXORSum4(nums: number[]): number {  
  let sum = 0  
  for (let i = 1; i < 2**nums.length; i++) {  
    const bitMask = i.toString(2).padStart(nums.length, '0')  
    let rollingXOR  
    for (let b = 0; b < bitMask.length; b++) {  
      if (bitMask[b] === '1') {  
        if (typeof rollingXOR === 'undefined') rollingXOR = nums[b]  
        else rollingXOR ^= nums[b]  
      }  
    }  
    sum += rollingXOR  
  }  
  return sum  
}  
  
/* read the solution  
I don't really follow the proof, but it's easy to implement the result.  
This doesn't really help, as there's no way I'd deduce this in an interview.  
*/  
const subsetXORSum = (nums: number[]): number =>  
  nums.reduce((a, b) => a | b, 0) << nums.length - 1  

Some Universal Design Principles

  1. Understand the context
    1. Identify key stakeholders, participants, and users
    2. Identify constraints: timelines, budgets, performance targets, outcomes
  2. Minimize complexity within the constraint parameters
  3. Resiliency
    1. Our primary path to resiliency is through redundancy
      1. Redundant data (typically distributed across independent data centers) reduces likelihood of data loss
      2. Redundant services reduces likelihood of downtime
  4. Scaling
    1. Prefer horizontal scaling for growth.
      1. Handle more users by adding more servers.
      2. When data grows too large, try splitting it into independent sets.
        … to be continued

A Framework for System Design Interviews

A friend shared this framework with me. It should help organize responses to design interview questions and ensure that I hit all the major points. I should practice writing out designs for a few systems using this framework before real interviews.
Also, this is mostly copy/paste, which isn’t great for retention. I should review the framework, rationalize it with principles and experiences I’m familiar with, come up with analogies, see if I can derive this or something similar myself.
I think it’s a nice framework. It’s probably a good sign that nothing here is particularly new or surprising, but it does a great job of organizing a lot of concepts I’m already familiar with.
I also appreciate the guidelines on how much time to spend in each section. If I can practice and learn to use this effectively, I can see it keeping me on track during a real interview.

1. Problem Exploration, Scope, and Functional Requirements (5 min)

  1. Identify & prioritize customers (internal vs external & size)
  2. Define business goals & metrics (revenue, user adoption)
  3. Identify use cases & functional requirements
  4. Identify cost factors (minimize cost of development vs cost of maintenance)

2. Non Functional Requirements (5 min)

  1. Performance (QPS - read vs write heavy, latency, data staleness)
  2. Availability (CAP, availability vs strong consistency, how many 9s?, fault tolerance: replication, conflict resolution, timeouts)
  3. Reliability (durable data, correct responses: verify with batch offline jobs)
  4. Security (pass tokens with requests)
  5. Maintainability (Monitoring, Profiling, Debuggability)
  6. Extensibility (Frameworks & reusable abstractions, plugin models)

3. API Design & Data Model (10 min)

  1. Handling bad actors
  2. API Safeguards (rate limiting, large response sizes, pagination)

4. Capacity Estimations (5 min)

  1. QPS (Read & Write; Peak & Average; Ratios)
  2. Storage size (1, 3, 5 years)
  3. Memory size (caching?)
  4. Network Bandwidth
  5. Disk IO?
  6. System bottlenecks (CPU/Memory/Disk/Network Bandwidth bound)

5. Draw Components & Deep Dive (15 min)

  1. Draw boxes and lines.
  2. Annotate major request flows.