# Liwei Wu...

always curious about the world around me

## Leetcode Summary

to write down ideas and summarize techniques when practicing leetcode questions

### Short-term Goal

#### Posted on May 12, 2018

I think I need to re-start practicing the leetcode questions, given that I need to start to interview for full-time jobs pretty soon after the Facebook internship. Cannot believe that the last time I did any question is back in November last year.

### Summary of Commonly Used C++ Syntax for Leetcode (Unfinished and Keep Updating)

#### Posted on Oct 10, 2017 && Last Updated on Nov 4, 2017

Since I am so used to searching everything whenever I need them, I think it may be a good idea to summarize what I have used so far in Leetcode. One can read more on range-based for loop here and lvalue, rvalue here. Also, for template coding of functions and classes, one can refer to the tutorial here.



// String
string name ("John");
name += " K. ";
name += '\n';
string str = "2001";
int i = stoi(str);
string pi = "pi is " + to_string(3.1415926);

// Array
int numbers[] = {20,40,50,10,30};
// sort (RandomAccessIterator first, RandomAccessIterator last, Compare comp);
// comp is binary function (or can either be a function pointer or a function object)
// Note: function object (functor) is struct with function call operator () defined
sort (numbers, numbers + 5, std::greater<int>());
int cx = std::count_if (numbers, numbers+5, std::bind2nd(std::greater_equal<int>(),0));
int a = 5, b = 3;
swap(a, b);

// Vector
vector<vector<bool>> dp(m + 1, vector<bool>(n + 1, false));    // how to initialize two-dim vector
vector<int> v(3,100);
size_t len = v.size();
v.push_back(100);
v.pop_back();
// one can use auto in c++11, and insert is not good for vector, use list
vector<int>::iterator it = v.begin();
it = v.insert(it, 200);

// pair
pair<string, double> product("tomatoes",2.30);
product.first = "shoes";
product.second = 39.90;
pair<int,int> foo;
product = make_pair(10,20);

// list (use deque if inserting front or back only)
// deque has similiar syntax but perform worse
// for insertion or removals of elements at positions other than the beginning or the end
list<int> l1(4, 100);
list<int> mylist1;
for (int i = 1; i <= 5; ++i)
mylist1.push_back(i);     // mylist1: 1 2 3 4 5
mylist1.erase(mylist1.end());
// same as mylist1.pop_back();    // mylist1: 1 2 3 4
// see push_back(100), pop_back(), push_front(100), pop_front()
list<int> l2{10, 20, 30};     // mylist2: 10 20 30
auto it = mylist1.begin();
mylist1.splice (it, mylist2);   // mylist1: 1 10 20 30 2 3 4
// mylist2: empty
// "it" still points to 2 (the 5th element)
mylist2.splice (mylist2.begin(),mylist1, it);
// mylist1: 1 10 20 30 3 4
// mylist2: 2
// "it" is now invalid.

// map and unordered_map
map<char,int> mymap;
map<char,int>::iterator it;
// insert some values:
mymap['a']=10;
mymap['b']=20;
mymap['c']=30;
mymap['d']=40;
mymap['e']=50;
mymap['f']=60;
it = mymap.find('b');
mymap.erase (it);                   // erasing by iterator
mymap.erase ('c');                  // erasing by key
it = mymap.find ('e');
mymap.erase ( it, mymap.end() );    // erasing by range
// show content:
for (it = mymap.begin(); it != mymap.end(); ++it) {
cout << it->first << " => " << it->second << '\n';
}
// same syntax for unordered_map, except map has Compare = less<Key> as
// default while unordered_map has none and is therefore faster.

// set and unordered_set
unordered_set<string> s({"orange","pink","yellow"});
unordered_set<string> myset = {"yellow","green","blue"};
string mystring = "red";
myset.insert (mystring);
// also use find, count as map/unordered_map
unordered_set<string> myset = {"Mercury","Venus","Earth",
"Mars","Jupiter","Saturn","Uranus","Neptune"};
cout << "myset contains:";
for (auto it = myset.begin(); it != myset.end(); ++it)
cout << " " << *it;
cout << std::endl;

// stack
stack<int> mystack;
mystack.push(10);
mystack.push(20);
mystack.top() -= 5;
mystack.pop();

// queue (<class T, class Container = deque<T>>)
queue<int> myqueue;
myqueue.push(3);
myqueue.push(4);
while (!myqueue.empty()) {
int tmp = myqueue.front();
myqueue.pop();
}

// priority_queue
priority_queue<int, vector<int>, greater<int>> pq;
priority_queue<int> mypq;
mypq.push(30);
mypq.push(25);
while (!mypq.empty()) {
cout << ' ' << mypq.top();
mypq.pop();
}
// cannnot iterate priority_queue, has to pop out one by one
// https://stackoverflow.com/questions/20826078/priority-queue-comparison
std::priority_queue<int, std::vector<int>, std::less<int> > pq;
// priority queue
struct cmp {
bool operator () (ListNode* a, ListNode* b) {
return a->val > b->val;
}
};

struct great {
bool operator () (int a, int b) {
return a > b;
}
};
sort(m.begin(), m.end(), great());
// I prefer functor over function pointer (i.e. lambda operator combined with
// decltype, which can extract type of varaible)
priority_queue<ListNode*, vector<ListNode*>, cmp> pq;

auto cmp = [](pair<string, int>& a, pair<string, int>& b) {
return a.second < b.second || (a.second == b.second && a.first > b.first);
};
priority_queue<pair<string,int>, vector<pair<string,int>>, decltype(cmp)> q(cmp);


### Leetcode 311. Sparse Matrix Multiplication

#### Posted on Nov 12, 2017

To be honest, I feel it stupid to store sparse matrix in the dense form and then try to optimize speed. I think it is a good question only if both input and output form is COO, CSC or CSR format.


class Solution {
public:
vector<vector<int>> multiply(vector<vector<int>>& A, vector<vector<int>>& B) {
vector<vector<int>> res(A.size(), vector<int>(B[0].size()));
for (int i = 0; i < A.size(); ++i) {
for (int k = 0; k < A[0].size(); ++k) {
if (A[i][k] != 0) {
for (int j = 0; j < B[0].size(); ++j) {
if (B[k][j] != 0) res[i][j] += A[i][k] * B[k][j];
}
}
}
}
return res;
}
};

class Solution {
public:
vector<vector<int>> multiply(vector<vector<int>>& A, vector<vector<int>>& B) {
vector<vectorlt;int>> res(A.size(), vector<int>(B[0].size()));
vector<vector<pair < int, int>>> v(A.size(), vector<pair<int,int>>());
for (int i = 0; i < A.size(); ++i) {
for (int k = 0; k < A[i].size(); ++k) {
if (A[i][k] != 0) v[i].push_back({k, A[i][k]});
}
}
for (int i = 0; i < A.size(); ++i) {
for (int k = 0; k < v[i].size(); ++k) {
int col = v[i][k].first;
int val = v[i][k].second;
for (int j = 0; j < B[0].size(); ++j) {
res[i][j] += val * B[col][j];
}
}
}
return res;
}
};


### Leetcode 347. Top K Frequent Elements

#### Posted on Nov 12, 2017

See description of previous question.


class Solution {
public:
vector<string> topKFrequent(vector<string>& words, int k) {
vector<string> res;
unordered_map<string, int> freq;
vector<set<string>> v(words.size() + 1, set<string>());
for (string word : words) ++freq[word];
for (auto a : freq) {
v[a.second].insert(a.first);
}
for (int i = v.size() - 1; i >= 0; --i) {
if (k <= 0) break;
vector<string> t(v[i].begin(), v[i].end());
if (k >= t.size()) {
res.insert(res.end(), t.begin(), t.end());
} else {
res.insert(res.end(), t.begin(), t.begin() + k);
}
k -= t.size();
}
return res;
}
};


### Leetcode 347. Top K Frequent Elements

#### Posted on Nov 12, 2017

Borrow idea from finding k largest elements in an array / in a stream (see here). There, the optimal solution is to use min heap of size k. (If finding k-th largest element in an array, then one should use quickselect algorithm). This approach has time complexity of $O(nlogk)$ which is useful when k is much smaller than n. Space we need only $O(k)$ compared to $O(n)$. If space is not an issue, then using counting sort idea is useful. A similar question is finding k most frequent words in a file (see ereh) and one need to turn to trie tree to store the counts efficiently.


struct cmp {
bool operator() (pair<int, int>& a, pair<int, int>& b) {
return a.first > b.first;
}
};

class Solution {
public:
vector<int> topKFrequent(vector<int>& nums, int k) {
unordered_map<int, int> cnts;
for (int x : nums) ++cnts[x];
// use min heap of size k, time takes O(nlogk)
// only use O(k) memory instead of O(n) in other approaches
priority_queue<pair<int, int>, vector<pair<int, int>>, cmp> pq;
for (auto it = cnts.begin(); it != cnts.end(); ++it) {
int cnt = it->second, val = it->first;
cout << cnt << " " << val << endl;
if (pq.empty() || pq.size() < k) {
pq.emplace(cnt, val);
} else {
//cout << "top" << pq.top().first << endl;
if (pq.top().first < cnt) {
pq.pop();
pq.emplace(cnt, val);
}
}
}
vector<int> res;
// O(klogk)
while (!pq.empty()) {
pair<int, int> t = pq.top();
pq.pop();
res.push_back(t.second);
}
sort(res.begin(), res.end());  // O(klog(k))
return res;
}
};

class Solution {
public:
vector<int> topKFrequent(vector<int>& nums, int k) {
unordered_map<int, int> m;
vector<vector> bucket(nums.size() + 1);
vector<int> res;
for (auto a : nums) ++m[a];
for (auto it : m) {
bucket[it.second].push_back(it.first);
}
for (int i = nums.size(); i >= 0; --i) {
for (int j = 0; j < bucket[i].size(); ++j) {
res.push_back(bucket[i][j]);
if (res.size() == k) return res;
}
}
return res;
}
};


### Leetcode 187. Repeated DNA Sequences

#### Posted on Nov 10, 2017

If we do not need to worry about space we used for hashtable, then the first solution is okay. However, it is a huge waste of time to use 8 bits for each A, C, G, T, especially when there are a lot of those. We can simply use two bits to represent.


class Solution {
public:
vector<string> findRepeatedDnaSequences(string s) {
unordered_set<string> res, st;
for (int i = 0; i + 10 <= s.size(); ++i) {
string t = s.substr(i, 10);
if (st.count(t)) res.insert(t);
else st.insert(t);
}
return vector<string>{res.begin(), res.end()};
}
};

class Solution {
public:
vector<string> findRepeatedDnaSequences(string s) {
unordered_set<string> res;
unordered_set<int> st; // store those appearing at least once
unordered_map<int, int> m{{'A', 0}, {'C', 1}, {'G', 2}, {'T', 3}};
int cur = 0, i = 0;
while (i < 9) cur = cur << 2 | m[s[i++]];
while (i < s.size()) {
// in total, we use 20 bits for 10 symbols, and mask is 18 bits
// therefore the mask is 0x3ffff or 0b11...11 (in total 18 1's)
cur = ((cur & 0x3ffff) << 2) | (m[s[i++]]);
if (st.count(cur)) res.insert(s.substr(i - 10, 10));
else st.insert(cur);
}
return vector<string>(res.begin(), res.end());
}
};


### Leetcode 677. Map Sum Pairs

#### Posted on Nov 8, 2017

Again, depends on which API Calls are more often: insert or sum.


public:
/** Initialize your data structure here. */
MapSum() {

}

void insert(string key, int val) {
m[key] = val;
}

int sum(string prefix) {
int res = 0, n = prefix.size();
for (auto it = m.lower_bound(prefix); it != m.end(); ++it) {
if (it->first.substr(0, n) != prefix) break;
res += it->second;
}
return res;
}

private:
map<string, int> m;
};

class MapSum {
public:
/** Initialize your data structure here. */
MapSum() {

}

void insert(string key, int val) {
int diff = val - m[key].first, n = key.size();
m[key].first = val;
for (int i = n - 1; i > 0; --i) {
m[key.substr(0, i)].second += diff;  // only need to add the diff, due to overwrite
}
}

int sum(string prefix) {
return m[prefix].first + m[prefix].second;
}

private:
// in the pair, first represents the value for exact match and second represents
// prefixes containing but not exact match
unordered_map<string, pair<int, int>> m;
};


### Leetcode 159. Longest Substring with At Most Two Distinct Characters

#### Posted on Nov 8, 2017

Using a sliding window with a hashmap. One can either use the hashmap to store the counts, or one can use the hashmap to store the last index of apperance. But the idea is the same.


// store the counts
class Solution {
public:
int lengthOfLongestSubstringTwoDistinct(string s) {
int res = 0, left = 0;
unordered_map<char, int> m;
for (int i = 0; i < s.size(); ++i) {
++m[s[i]];
while (m.size() > 2) {
if (--m[s[left]] == 0) {
m.erase(s[left]);
}
left++;
}
res = max(res, i - left + 1);
}
return res;
}
};

// store the last index of apperance
class Solution {
public:
int lengthOfLongestSubstringTwoDistinct(string s) {
int res = 0, left = 0;
unordered_map<char, int> m;
for (int i = 0; i < s.size(); ++i) {
m[s[i]] = i;
while (m.size() > 2) {
if (m[s[left]] == left) m.erase(s[left]);
++left;
}
res = max(res, i - left + 1);
}
return res;
}
};


### Leetcode 406. Queue Reconstruction by Height

#### Posted on Nov 8, 2017

The idea is to sort the people by height from highest to lowest and then by the number of poeple with higher or equal height in front of him from largest to smallest. Then insert them one by one using the second index as offset from the front. Here, we assume the values given are indeed valid. Using vector may not be the most efficient way compared to using linked list.


class Solution {
public:
vector<pair<int, int>> reconstructQueue(vector<pair<int, int>>& people) {
sort(people.begin(), people.end(), [](const pair<int, int>& a, const pair<int, int>& b) {
return a.first > b.first || (a.first == b.first && a.second < b.second);
});
vector<pair<int, int>> res;
for (auto a : people) {
res.insert(res.begin() + a.second, a);
}
return res;
}
};


### Leetcode 162. Find Peak Element

#### Posted on Nov 7, 2017

If one really understands binary search, this medium level questions become an easy one as soon as one realize it is just a sequnce of no's followed by yes's using the predicate nums[mid] >= nums[mid + 1]


class Solution {
public:
int findPeakElement(vector<int>& nums) {
int left = 0, right = nums.size() - 1;
while (left < right) {
int mid = left + (right - left) / 2;
if (nums[mid] >= nums[mid + 1]) {
right = mid;
} else {
left = mid + 1;
}
}
return left;
}
};


### Leetcode 369. Plus One Linked List

#### Posted on Nov 6, 2017

This question I think is quite interesting. At first look, it is not quite simple. Since for linked lists, it is hard to traverse in the reverse way. So one naive idea is to reverse the list first and performing "add 1" operation before reversing back to the original order. It is okay but not very efficient. The following method is much better, since it can change the node value in the recursion and returns 1 if carry exists, otherwise return 0 but modifying the values in place.


/**
* Definition for singly-linked list.
* struct ListNode {
*     int val;
*     ListNode *next;
*     ListNode(int x) : val(x), next(NULL) {}
* };
*/
class Solution {
public:
ListNode* plusOne(ListNode* head) {
int carry = helper(head);
if (carry == 0) return head;
ListNode* res = new ListNode(1);
return res;
}

int helper(ListNode* head) {
if (!head) return 1;
int carry = helper(head->next);
int sum = head->val + carry;
head->val = sum % 10;
//cout << sum;
return sum / 10;
}
};


### Leetcode 568. Maximum Vacation Days

#### Posted on Nov 6, 2017

This question should be solved using dynamic programming technique. Usually when the space complexity in $O(n^2)$, then it should be optimized till $O(2n)$ using two vectors instead one matrix. And here, one can loop through the matrix from day 0 till day $k - 1$ or reversely. The solution below is doing it from day 0, and therefore we should initialize the dp array at day -1 by making only the starting point to be 0, the rest to be INT_MIN.


class Solution {
public:
int maxVacationDays(vector<vector<int>>& flights, vector<vector<int>>& days) {
// flights: n by n, days: n by k
int n = flights.size(), k = days[0].size();
vector<int> dp(n, INT_MIN);
dp[0] = 0;
for (int j = 0; j < k; ++j) {
// dp stores the previous week maximum number of holidays
// max over staying at the same city or going somewhere else
vector<int> t(n, INT_MIN);
for (int p = 0; p < n; ++p) {
for (int i = 0; i < n; ++i) {
if (i == p || flights[p][i]) {
t[i] =  max(t[i], dp[p] + days[i][j]);
}
}
}
dp = t;
}
int res = 0;
for (int i = 0; i < dp.size(); ++i) {
//cout << dp[i] << " ";
res = max(res, dp[i]);
}
return res;
}
};


### Leetcode 681. Next Closest Time

#### Posted on Nov 5, 2017

Since 1440 is a very small number, we can actually do brute force and check if the next time has all digits in original time string.


class Solution {
public:
string nextClosestTime(string time) {
string res = "0000";
vector<int> v{600, 60, 10, 1};
int found = time.find(":");
int cur = stoi(time.substr(0, found)) * 60 + stoi(time.substr(found + 1));
for (int i = 1, d = 0; i <= 1440; ++i) {
int next = (cur + i) % 1440;
for (d = 0; d < 4; ++d) {
res[d] = '0' + next / v[d];
next %= v[d];
if (time.find(res[d]) == string::npos) break;
}
if (d >= 4) break;
}
return res.substr(0, 2) + ":" + res.substr(2);
}
};


### Leetcode 683. K Empty Slots

#### Posted on Nov 5, 2017


class Solution {
public:
int kEmptySlots(vector<int>& flowers, int k) {
// flowers record day (starting from 0), flower (starting from 1) key-value pair
// days record flower(starting from 0), day (starting from 1) key-value pair
vector<int> days(flowers.size(), 0);
for (int i = 0; i < flowers.size(); ++i) {
days[flowers[i] - 1] = i + 1;
}
/*for (int i = 0; i < days.size(); ++i)
cout << days[i] << " ";*/

// the problem is converted to following problem now
// days[i] > days[left] && days[i] > days[right] for any left < i < right
// and we require right = left + k + 1
int left = 0, right = k + 1;
int res = INT_MAX;
for (int i = 1; right < days.size(); ++i) {
if (days[i] > days[left] && days[i] > days[right]) {
continue;
}
// we reach boundary of sliding window and found what we want
if (i == right) {
res = min(res, max(days[left], days[right]));
}
// days[i] does not satisfy the criteria
// days: 2 4 1 3 2 1 and k = 2
//       [     ]
// can skip 4 1 3 2, go to 1 3 2 1 directly,
// due to the fact 4 > 2 (or 3) and 1 < 2 (or 3) => 1 < 4
left = i;
right = i + k + 1;

}
return (res == INT_MAX) ? -1 : res;
}
};


### Leetcode 305. Number of Islands II

#### Posted on Nov 5, 2017

Using unionFind to solve the problem instead of bfs and dfs. UnionFind has time complexity of $O((M + N) \log N)$ for M operations and N nodes.


class Solution {
public:
vector<int> numIslands2(int m, int n, vector<pair<int, int>>& positions) {
vector<int> res;
if (m <= 0 || n <= 0) return res;
vector<int> roots(m * n, -1);
int cnt = 0;
vector<vector<int> > dirs{{0, -1}, {-1, 0}, {0, 1}, {1, 0}};
for (auto a : positions) {
int id = n * a.first + a.second;
roots[id] = id;
++cnt;
for (auto d : dirs) {
int x = a.first + d[0], y = a.second + d[1];
int cur_id = n * x + y;
if (x < 0 || x >= m || y < 0 || y >= n || roots[cur_id] == -1) continue;
int new_id = findRoots(roots, cur_id);
if (id != new_id) {
roots[id] = new_id;
id = new_id;
--cnt;
}
}
res.push_back(cnt);
}
return res;
}

int findRoots(vector<int> &roots, int id) {
while (id != roots[id]) {
roots[id] = roots[roots[id]];
id = roots[id];
}
return id;
}
};


### Leetcode 463. Island Perimeter

#### Posted on Nov 4, 2017

It is wrong to only find the maximum horizontal and vertical range. One approach is to scan for all i, j and only examine neighbors to the left and to the top. The other approach is doing DFS as before but need to pass the int ans by pointer (or by reference) and update it during recursion. The advantage of second approach over first one is when we know there is only one connected island within in very large rectangle. Then we do not need to travese every single little in the huge rectangle, which could be a waste of time.


class Solution {
public:
int islandPerimeter(vector<vector<int>>& grid) {
int res = 0;
if (grid.empty() || grid[0].empty()) {
return res;
}
for (int i = 0; i < grid.size(); ++i) {
for (int j = 0; j < grid[0].size(); ++j) {
if (grid[i][j] == 1) {
res += 4;
if (i != 0 && grid[i - 1][j] == 1) {
// here, substracting 2 because the side has been added twice by
// the square to left and this square
res -= 2;
}
if (j != 0 && grid[i][j - 1] == 1) {
res -= 2;
}
}
}
}
return res;
}
};

class Solution {
public:
void dfs(vector<vector<int>>& b, int *ans, int i, int j) {
if (i < 0 || i >= b.size() || j < 0 || j >= b[0].size() || b[i][j] != 1)
return;
b[i][j] = -1; // mark it as visited
*ans += (j + 1 >= b[0].size() || b[i][j+1] == 0) /* right */ +
(i - 1 < 0            || b[i-1][j] == 0) /* top */ +
(j - 1 < 0            || b[i][j-1] == 0) /* left */ +
(i + 1 >= b.size()    || b[i+1][j] == 0) /* bottom */;
dfs(b, ans, i, j + 1);
dfs(b, ans, i - 1, j);
dfs(b, ans, i, j - 1);
dfs(b, ans, i + 1, j);
return;
}
int islandPerimeter(vector<vector<int>>& grid) {
int ans = 0, i, j;
for (i = 0; i < grid.size(); i++) {
for (j = 0; j < grid[0].size(); j++) {
if (grid[i][j]) {
dfs(grid, &ans, i, j);
return ans;
}
}
}
return 0;
}

};


### Leetcode 200. Number of Islands

#### Posted on Nov 4, 2017

Either by BFS or DFS.


class Solution {
public:
int numIslands(vector<vector<char>>& grid) {
if (grid.empty() || grid[0].empty()) {
return 0;
}
int m = grid.size(), n = grid[0].size(), res = 0;
vector<vector<bool>> visited(m, vector<bool>(n, false));
for (int i = 0; i < m; ++i) {
for (int j = 0; j < n; ++j) {
if (grid[i][j] == '1' && visited[i][j] == false) {
numIslandsDFS(grid, visited, i, j);
res++;
}
}
}
return res;
}

void numIslandsDFS(vector<vector<char>>& grid, vector<vector<bool>>& visited,
int i, int j) {
// At first, I made a bug by i <= 0, j <= 0
// So stupid
if (i < 0 || i >= grid.size()) return;
if (j < 0 || j >= grid[0].size()) return;
if (grid[i][j] == '0' || visited[i][j] == true) return;
visited[i][j] = true;
numIslandsDFS(grid, visited, i - 1, j);
numIslandsDFS(grid, visited, i + 1, j);
numIslandsDFS(grid, visited, i, j - 1);
numIslandsDFS(grid, visited, i, j + 1);
}
};


### Leetcode 155. Min Stack

#### Posted on Nov 4, 2017

I think previously when I did it, I used the two stacks approach, which is okay but not as compact as the codes below. Basically the idea is that one can store the previous min_val in the same stack, separting the actual values.


class MinStack {
public:
/** initialize your data structure here. */
MinStack() {
min_val = INT_MAX;
}

// -2 0 -3 -3
// INT_MAX -2 0 -2 -3  -3 -3
void push(int x) {
// Note that here, we should include equal sign
if (x <= min_val) {
s.push(min_val);
min_val = x;
}
s.push(x);
}

void pop() {
int x = s.top();
s.pop();
if (x == min_val) {
min_val = s.top();
s.pop();
}
}

int top() {
return s.top();
}

int getMin() {
return min_val;
}

private:
int min_val;
stack<int> s;
};

/**
* Your MinStack object will be instantiated and called as such:
* MinStack obj = new MinStack();
* obj.push(x);
* obj.pop();
* int param_3 = obj.top();
* int param_4 = obj.getMin();
*/


### Leetcode 4. Median of Two Sorted Arrays

#### Posted on Nov 4, 2017

This question is pretty hard. It uses the divide and conquer idea. Finding k-th element at each step can be simplied as finding only one half of each array. The reason is that if nums1[m1 - 1] < nums2[m2 - 1], then the k-th element has to be to the right of cut in the num1, and to the left of the cut in the nums2. Otherwise, proof by contradition, something goes wrong.


class Solution {
public:
// 3
// 1 2

// 1 2
// 3 4
int findKth(vector<int>& nums1, int i, vector<int>& nums2, int j, int k) {
if (nums1.size() - i > nums2.size() - j) {
return findKth(nums2, j, nums1, i, k);
}
if (nums1.size() == i) {
return nums2[j + k - 1];
}
if (k == 1) {
return min(nums1[i], nums2[j]);
}
int m1 = min(i + k / 2, static_cast<int>(nums1.size()));
// i + j + k = m1 + m2
int m2 = i + j + k - m1;
if (nums1[m1 - 1] < nums2[m2 - 1]) {
// this means k-th element is to the right of nums1 but to the left of nums2
// excluding cut-off point, since it cannot be the res
return findKth(nums1, m1, nums2, j, k - (m1 - i));
} else if (nums1[m1 - 1] > nums2[m2 - 1]) {
return findKth(nums1, i, nums2, m2, k - (m2 - j));
} else {
return nums1[m1 - 1];
}
}

double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
int total = nums1.size() + nums2.size();
if (total % 2 == 1) {
return static_cast<double>(findKth(nums1, 0, nums2, 0, (total + 1) / 2));
} else {
return static_cast<double>(findKth(nums1, 0, nums2, 0, total / 2)
+ findKth(nums1, 0, nums2, 0, total / 2 + 1)) / 2.0;
}
}
};


### Leetcode 535. Encode and Decode TinyURL

#### Posted on Sep 10, 2017

Basically we want to use $62^6 = 56,800,235,584$ to represent all the URLs for all the websites (1.6 billion as of 2016 so 6 digits are sufficient). To achieve this, first idea is to design a hashing function. One can use the codes below to avoid conflicts but it will be slow eventually, because towards the end, it will be harder and harder to find approporiate shorturl. Similar idea, one can use md5/sha1 which takes "longurl + timestamp" as key and regenerate hash codes when there are conflicts. Of course, when data gets very large and exceed memory size of one machine, one should use distributed SQL or NoSQL database to store key-value pair. (also depending on the peak QPS: Queries Per Second).

See here for more details on the use of rand(), and srand() to generate pseudo-random number.


class Solution {
public:
Solution() {
dict = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
short2long.clear();
long2short.clear();
srand(time(nullptr));
}

// Encodes a URL to a shortened URL.
string encode(string longUrl) {
if (long2short.find(longUrl) != long2short.end()) {
return "http://tinyurl.com/" + long2short[longUrl];
}
int idx;
string randStr;
for (int i = 0; i < 6; ++i) {
randStr.push_back(dict[rand() % 62]);
}
while (short2long.find(randStr) != short2long.end()) {
randStr[idx] = dict[rand() % 62];
idx = (idx + 1) % 5;
}
short2long[randStr] = longUrl;
long2short[longUrl] = randStr;
return "http://tinyurl.com/" + randStr;
}

// Decodes a shortened URL to its original URL.
string decode(string shortUrl) {
string randStr = shortUrl.substr(shortUrl.rfind("/") + 1);
return short2long.find(randStr) == short2long.end() ? shortUrl : short2long[randStr];
}

private:
string dict;
unordered_map<string, string> short2long, long2short;
};


### Leetcode 534. Design TinyURL


public class URLService {
HashMap<String, Integer> ltos;
HashMap<Integer, String> stol;
static int COUNTER;
String elements;
URLService() {
ltos = new HashMap<String, Integer>();
stol = new HashMap<Integer, String>();
COUNTER = 1;
elements = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
}
public String longToShort(String url) {
String shorturl = base10ToBase62(COUNTER);
ltos.put(url, COUNTER);
stol.put(COUNTER, url);
COUNTER++;
return "http://tiny.url/" + shorturl;
}
public String shortToLong(String url) {
url = url.substring("http://tiny.url/".length());
int n = base62ToBase10(url);
return stol.get(n);
}

public int base62ToBase10(String s) {
int n = 0;
for (int i = 0; i < s.length(); i++) {
n = n * 62 + convert(s.charAt(i));
}
return n;

}
public int convert(char c) {
if (c >= '0' && c <= '9')
return c - '0';
if (c >= 'a' && c <= 'z') {
return c - 'a' + 10;
}
if (c >= 'A' && c <= 'Z') {
return c - 'A' + 36;
}
return -1;
}
public String base10ToBase62(int n) {
StringBuilder sb = new StringBuilder();
while (n != 0) {
sb.insert(0, elements.charAt(n % 62));
n /= 62;
}
while (sb.length() != 6) {
sb.insert(0, '0');
}
return sb.toString();
}
}


### Leetcode 38. Count and Say

#### Posted on Oct 30, 2017

This question can be done in both recursion way and iterative way. Note that the use of while loop inside the for loop: it is meant to find the last index of repetitive digits. Because that is when one should append to the res string.


class Solution {
public:
string countAndSay(int n) {
if (n <= 0) return "";
string res = "1";
while (--n) {
string cur = "";
for (int i = 0; i < res.size(); ++i) {
int cnt = 1;
// "1211"
while (i + 1 < res.size() && res[i] == res[i + 1]) {
++cnt;
++i;
}
cur += to_string(cnt) + res[i];
}
res = cur;
}
return res;
}
};


### Leetcode 301. Remove Invalid Parentheses

#### Posted on Oct 30, 2017

The idea of using BFS is not hard to come up with, since we are basically trying to remove as few as parentheses as possible (moving as few levels down as possible if we treat each level down is one fewer parethesis). The trickly part is that we need to use a boolean value to stop pushing trimmed string to the queue when reaching the first leaf node. The reason is that for input ")()()" we only want "()()" but not "()" or "" and this is achieved by using this boolean value. Because we are doing BFS, this can easily be done.


class Solution {
public:
bool isValid(string s) {
int cnt = 0;
for (int i = 0; i < s.size(); ++i) {
if (s[i] == '(') {
cnt++;
} else if (s[i] == ')') {
if (cnt == 0) {
return false;
} else {
cnt--;
}
}
}
return cnt == 0;
}

vector<string> removeInvalidParentheses(string s) {
vector<string> res;
unordered_set<string> visited;
queue<string> q;
q.push(s);
visited.insert(s);
bool found = false;
while (!q.empty()) {
string str = q.front();
q.pop();
if (isValid(str)) {
res.push_back(str);
found = true;
}
if (found == true) {
continue;
}
for (int i = 0; i < str.size(); ++i) {
if (str[i] != '(' && str[i] != ')') {
continue;
}
string new_str = str.substr(0, i) + str.substr(i + 1);
if (visited.find(new_str) == visited.end()) {
q.push(new_str);
visited.insert(new_str);
}
}
}
return res;
}
};


### Leetcode 23. Merge k Sorted Lists

#### Posted on Oct 29, 2017

One can use the min heap to find the $k$ largest elements in array. See here. The time complexity is $O(k + (n - k)*\log k)$ compared to using max heap (default in C++) which has time complexity $O(n + k\log n)$.

Pay attention to the syntax that lambda expression cannot be used inside priority_queue template and one should use functor. This is different from sort. For sorting, there are a few different ways of changing default comparator, either by overloading operators or using lambda expression/functor to create custom operator (in C++, default is std:less, and default is max heap for priority queue, sorting order is increasing by fault). One can use functor for sort too but one should use cmp() instead of cmp.

Advantage of C++ over Python in interview is that one can easily use min-heap by

priority_queue<int, vector<int>, greater<int>> Q; 

For the question itself, there are two ways of solving this question. One is divide and conquer: merge 2 lists at one time, and do it from bottom-up approach. The other is use extra space to store first node for each linked-list. Both approaches have the same time complexity: $O(n \ k \ log(k))$, where $n$ is average number of nodes in a list and $k$ is number of lists.

Also pay attention to the difference of heap and priority_queue: see here. The one usually referred is called Binary Heap but there are many more variants of heaps and see this notes for time complexity comparisons link.


/**
* Definition for singly-linked list.
* struct ListNode {
*     int val;
*     ListNode *next;
*     ListNode(int x) : val(x), next(NULL) {}
* };
*/

// Divide and Conquer
class Solution {
public:
ListNode* mergeKLists(vector<ListNode*>& lists) {
if (lists.size() == 0) return NULL;
int n = lists.size();
while (n > 1) {
int k = (n + 1) / 2;
for (int i = 0; i < n / 2; ++i) {
lists[i] = mergeTwoLists(lists[i], lists[i + k]);
}
n = k;
}
return lists[0];
}

ListNode* mergeTwoLists(ListNode* l1, ListNode* l2) {
ListNode* head = new ListNode(-1);
ListNode* cur = head;
while (l1 && l2) {
if (l1->val < l2->val) {
cur->next = l1;
l1 = l1->next;
} else {
cur->next = l2;
l2 = l2->next;
}
cur = cur->next;
}
if (l1) cur->next = l1;
if (l2) cur->next = l2;
}
};

// priority queue
struct cmp {
bool operator () (ListNode* a, ListNode* b) {
return a->val > b->val;
}
};

struct great {
bool operator () (int a, int b) {
return a > b;
}
};

class Solution {
public:
ListNode* mergeKLists(vector<ListNode*>& lists) {
// first 10 lines are not related to question itself....
vector<int> m{2, 5, 2, 3, 4};
/*struct great {
bool operator () (int a, int b) {
return a > b;
}
}great;
sort(m.begin(), m.end(), great);*/
sort(m.begin(), m.end(), great());
for (int i = 0; i < m.size(); ++i) {
cout << m[i] << " ";
}

priority_queue<ListNode*, vector<ListNode*>, cmp> pq;
for (int i = 0; i < lists.size(); ++i) {
if (lists[i]) {
pq.push(lists[i]);
}
}
ListNode* head = nullptr;
ListNode* pre = nullptr;
while (!pq.empty()) {
ListNode* tmp = pq.top();
pq.pop();
}
else {
pre->next = tmp;
}
pre = tmp;
if (tmp->next) {
pq.push(tmp->next);
}

}
}
};


### Leetcode 15. 3Sum

#### Posted on Oct 29, 2017

If sorting is allowed, then the optimal solution with $O(n^2)$ time complexity is a big for loop, then a while loop with two pointers. Two get rid of duplicates, one need to write a few extra lines marked below.


class Solution {
public:
vector<vector<int>> threeSum(vector<int>& nums) {
vector<vector<int>> res;
sort(nums.begin(), nums.end());
for (int k = 0; k < nums.size(); ++k) {
if (nums[k] > 0) break;
// to avoid duplicates
if (k > 0 && nums[k] == nums[k - 1]) continue;
int target = 0 - nums[k];
int i = k + 1, j = nums.size() - 1;
while (i < j) {
if (nums[i] + nums[j] == target) {
res.push_back({nums[k], nums[i], nums[j]});
// to avoid duplicates
while (i < j && nums[i] == nums[i + 1]) ++i;
while (i < j && nums[j] == nums[j - 1]) --j;
++i; --j;
} else if (nums[i] + nums[j] < target) ++i;
else --j;
}
}
return res;
}
};

// If sorting of whole array is not allowed (common follow-up question for fb)
class Solution {
public:
vector<vector<int>> threeSum(vector<int>& nums) {
set<vector<int>> res;
unordered_map<int, int> freq;
for (int i = 0; i < nums.size(); ++i) {
freq[nums[i]]++;
}
for (int i = 0; i < nums.size() - 1; ++i) {
for (int j = i + 1; j < nums.size(); ++j) {
int target = 0 - nums[i] - nums[j];
if (freq.count(target)) {
int count = 0;
if (nums[i] == target) count++;
if (nums[j] == target) count++;
if (freq[target] > count) {
vector<int> tmp = {nums[i], nums[j], target};
sort(tmp.begin(), tmp.end());
res.insert(tmp);
}
}
}
}
return vector<vector<int>>(res.begin(), res.end());
}
};


### Leetcode 278. First Bad Version

#### Posted on Oct 29, 2017

Similar to the question below, one just need to figure out when is the predicate(mid). Here is obviously given by the API Call isBadVersion(mid). When one see the code, there is some similarity between binary search and Hoare-partition (which also has right = mid, left = mid + 1 in the recursion). Another thing to note is that when to return the mid, then right is nums.size() rather than nums.size() - 1 if one decides to use right = mid instead of right = mid - 1


// Forward declaration of isBadVersion API.

class Solution {
public:
int firstBadVersion(int n) {
int left = 1, right = n;
while (left < right) {
int mid = left + (right - left) / 2;
right = mid;
} else {
left = mid + 1;
}
}
return left;
}
};


### Leetcode 658. Find K Closest Elements

#### Posted on Oct 29, 2017

Read more on binary search here. The fact that input array is sorted and is an array suggests that we should use binary search which has time complexity $O(\log n)$. If the input is list (linked list), then we should use a different algorithm which deletes the left end or the right end (depending on which one is farther) until the remaining list is of length $k$ and then we can return the list as our result.


class Solution {
public:
vector<int> findClosestElements(vector<int>& arr, int k, int x) {
int left = 0, right = arr.size() - k;
while (left < right) {
int mid = left + (right - left) / 2;
// want first yes in a sequence of no's followed by yes's
// predicate of p(mid) == true
if (x - arr[mid] <= arr[mid + k] - x) {
right = mid;
} else {
left = mid + 1;
}
}
return vector<int>(arr.begin() + left, arr.begin() + left + k);
}
};


### Leetcode 121. Best Time to Buy and Sell Stock

#### Posted on Oct 29, 2017

Just need to use two variables to keep updating the current minimum buy price and updating current maximum potential profit.


class Solution {
public:
int maxProfit(vector<int>& prices) {
int res = 0, buy = INT_MAX;
for (int price : prices) {
res = max(res, price - buy);
}
return res;
}
};


### Leetcode 308. Range Sum Query 2D - Mutable

#### Posted on Oct 29, 2017

Another question has something to do with the trade-off of two operations: update and sumRegion. The implementation really depends on which API call is more frequent. If there are a lot of updates but very few sums, then we can simply allow sumRegion to be naive implementation with $O(n^2)$ time complexity. If there are a lot of sums but very few updates then we can maintain a matrix of cumulative sum of all elements with smaller indexes. So this way, sum operation is $O(1)$ but the update operation is $O(n^2)$. On the other hand, one can use Fenwick tree to have both operations of time complxity $O(\log n)$. The second solution is also a balanced way but with both operations of time complexity $O(n)$ but much easier to implement than 2D-Fenwick tree.


class NumMatrix {
public:
NumMatrix(vector<vector<int>> matrix) {
if (matrix.empty() || matrix[0].empty()) return;
mat.resize(matrix.size() + 1, vector<int>(matrix[0].size() + 1, 0));
bit.resize(matrix.size() + 1, vector<int>(matrix[0].size() + 1, 0));
for (int i = 0; i < matrix.size(); ++i) {
for (int j = 0; j < matrix[i].size(); ++j) {
update(i, j, matrix[i][j]);
}
}
}

void update(int row, int col, int val) {
int diff = val - mat[row + 1][col + 1];
for (int i = row + 1; i < mat.size(); i += i & -i) {
for (int j = col + 1; j < mat[i].size(); j += j & -j) {
bit[i][j] += diff;
}
}
mat[row + 1][col + 1] = val;
}

int sumRegion(int row1, int col1, int row2, int col2) {
return getSum(row2 + 1, col2 + 1) - getSum(row1, col2 + 1) - getSum(row2 + 1, col1) + getSum(row1, col1);
}

int getSum(int row, int col) {
int res = 0;
for (int i = row; i > 0; i -= i & -i) {
for (int j = col; j > 0; j -= j & -j) {
res += bit[i][j];
}
}
return res;
}

private:
vector<vector<int>> mat;
vector<vector<int>> bit;
};
/**
* Your NumMatrix object will be instantiated and called as such:
* NumMatrix obj = new NumMatrix(matrix);
* obj.update(row,col,val);
* int param_2 = obj.sumRegion(row1,col1,row2,col2);
*/

// Column Sum
class NumMatrix {
public:
NumMatrix(vector<vector<int>> matrix) {
if (matrix.empty() || matrix[0].empty()) return;
mat = matrix;
colSum.resize(matrix.size() + 1, vector<int>(matrix[0].size(), 0));
for (int i = 1; i < colSum.size(); ++i) {
for (int j = 0; j < colSum[0].size(); ++j) {
colSum[i][j] = colSum[i - 1][j] + matrix[i - 1][j];
}
}
}

void update(int row, int col, int val) {
for (int i = row + 1; i < colSum.size(); ++i) {
colSum[i][col] += val - mat[row][col];
}
mat[row][col] = val;
}

int sumRegion(int row1, int col1, int row2, int col2) {
int res = 0;
for (int j = col1; j <= col2; ++j) {
res += colSum[row2 + 1][j] - colSum[row1][j];
}
return res;
}

private:
vector<vector<int>> mat;
vector<vector<int>> colSum;
};



### Leetcode 56. Merge Intervals

#### Posted on Oct 27, 2017

The tricky part is to sort the intervals based on the left endpoint.


/**
* Definition for an interval.
* struct Interval {
*     int start;
*     int end;
*     Interval() : start(0), end(0) {}
*     Interval(int s, int e) : start(s), end(e) {}
* };
*/

struct cmp {
bool operator() (Interval& a, Interval& b) {
return (a.start < b.start);
}
};

class Solution {
public:
vector<Interval> merge(vector<Interval>& intervals) {
vector<Interval> res;
if (intervals.empty()) return res;
sort(intervals.begin(), intervals.end(), cmp());
res.push_back(intervals[0]);
for (int i = 1; i < intervals.size(); ++i) {
if (res.back().end >= intervals[i].start) {
int start = res.back().start, end = res.back().end;
res.pop_back();
res.push_back(Interval(start, max(end, intervals[i].end)));
} else {
res.push_back(intervals[i]);
}
}
return res;
}

};


### Leetcode 17. Letter Combinations of a Phone Number

#### Posted on Oct 27, 2017

Using recursion is much clearer than the codes I wrote in Python before. That is one point I want to convey: more lines is not a bad thing sometimes and it is actually a good thing if longer codes are self-explanatory and short ones are not.


class Solution {
public:
vector<string> letterCombinations(string digits) {
vector<string> res;
if (digits == "") return res;
vector<string> dict = {"abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"};
letterCombinations(digits, dict, 0, "", res);
return res;
}

void letterCombinations(string digits, vector<string>& dict,
int level, string s, vector<string>& res) {
if (level == digits.size()) {
res.push_back(s);
} else {
string str = dict[digits[level] - '2'];
//cout << str;
for (int i = 0; i < str.size(); ++i) {
s.push_back(str[i]);
letterCombinations(digits, dict, level + 1, s, res);
s.pop_back();
}
}
return;
}
};


### Leetcode 20. Valid Parentheses

#### Posted on Oct 27, 2017

Use of stack. Pay attention to that need to check if the stack is empty in the end. I am doing FB questions now and they seem quite simple but you need to be bug-free and be thorough in thoughts.


cclass Solution {
public:
bool isValid(string s) {
stack<char> parentheses;
for (int i = 0; i < s.size(); ++i) {
if (s[i] == '(' || s[i] == '[' || s[i] == '{') parentheses.push(s[i]);
else {
if (parentheses.empty()) return false;
if (s[i] == ')' && parentheses.top() != '(') return false;
if (s[i] == ']' && parentheses.top() != '[') return false;
if (s[i] == '}' && parentheses.top() != '{') return false;
parentheses.pop();
}
}
return parentheses.empty();
}
};


### Leetcode 1. Two Sum

#### Posted on Oct 27, 2017

This is like the first leetcode question I have ever done. Use of hashmap is easiest way. Actually look at OOD deisgn question based on two sum, it is not clear what implementation to do exactly. It is up to the requirements of the system. Does it have more adds or more finds?


class Solution {
public:
vector<int> twoSum(vector<int>& nums, int target) {
unordered_map<int,int> m;
vector<int> v{0 , 0};
for (int i = 0; i < nums.size(); ++i) {
if (m.find(nums[i]) == m.end()) {
m[target - nums[i]] = i;
}
else {
v[0] = i;
v[1] = m[nums[i]];
break;
}
}
return v;

}
};


### Leetcode 461. Hamming Distance

#### Posted on Oct 26, 2017

Mainly to test if one knows "Exclusive Or".


class Solution {
public:
int hammingDistance(int x, int y) {
int tmp = x ^ y;
int res = 0;
while (tmp) {
if (tmp % 2) res++;
tmp /= 2;
}
return res;
}
};


### Leetcode 283. Move Zeroes

#### Posted on Oct 26, 2017

I remember the second one is how I failed the last interview with fb.... A simple variation from the standard leetcode question with two pointers. But I had a hard time seeing the difference back then and made a lot of mistakes in the middle of coding.


class Solution {
public:
void moveZeroes(vector<int>& nums) {
// 0, 1, 0, 3, 12
// 1  0  0  3  12
//    l     i

// all elements before l are zeros
int l = 0;
for (int i = 0; i < nums.size(); ++i) {
if (nums[i]) {
swap(nums[i], nums[l++]);
}
}
return;
}

int moveZeroes(vector<int>& nums) {
// if all elements after nonzeros are not important, then no need to do swaps
int l = 0;
for (int i = 0; i < nums.size(); ++i) {
if (nums[i]) {
nums[l++] = nums[i];
}
}
// l is number of non-zeros
cout << l << endl;
return l;
}
};


### Leetcode 12. Integer to Roman

#### Posted on Oct 26, 2017

Greedy algorithm: substract as many as possible for all these numbers: 1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1 in this particular order. This only work for numbers up to 3999 for obvious reason.


class Solution {
public:
string intToRoman(int num) {
string res = "";
vector<int> val{1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
vector<string> str{"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};
for (int i = 0; i < val.size(); ++i) {
while (num >= val[i]) {
num -= val[i];
res += str[i];
}
}
return res;
}
};


### Leetcode 13. Roman to Integer

#### Posted on Oct 26, 2017

Nothing too interesting. Just remember two cases: add up or substract the integer depending on the order of two adjacent Roman letters.


class Solution {
public:
int romanToInt(string s) {
unordered_map<char, int> m{
{'I', 1}, {'V', 5}, {'X', 10}, {'L', 50}, {'C', 100}, {'D', 500}, {'M', 1000}
};
int res = 0;
for (int i = 0; i < s.size(); ++i) {
int val = m[s[i]];
if (i == s.size() - 1 || m[s[i + 1]] <= m[s[i]]) res += val;
else res -= val;
}
return res;
}
};


### Leetcode 215. Kth Largest Element in an Array

#### Posted on Oct 26, 2017

This question is to test quickselect which has $O(n)$ time complexity on average, lower than sorting algorithms such as quickselect $O(n\log n)$ time complexity. However, quickselect is resembling quicksort in many ways: including the two partition schemes. One is known as Lomuto partition scheme and the other one is called Hoare partition scheme. The main difference is that in Lomuto partition scheme, one can make sure the partition element is at the index returned but for Hoare partition scheme this is no longer true. We don't know exactly where the partition element is at, except the fact that is not at the right side of the index returned. That is why for the recursion part of the algorithm, we can see that there is difference as well. One is using pos - 1 and the other is using pos; and the first one recursion ends when pos == k and second one ends when start == end. The two differences are very important. Otherwise you will get bug for examples like [2, 4, 1, 3] and 2. You can return 1 instead of correct answer 3. References I used are link1 and link2. Quicksort is slightly easier to implement than quickselect. One does not need to use $k$ and the recursion ends when start >= end. It is divide and conquer algorithm. Here is not the case, choosing one branch of recursion rather than going both ways. Talking about Quicksort, it does not work well for sorting linkedlists since it requires random access to elements in the list and in that case one should use Mergesort, and it can be easily implemented in-place. (In-place is usually difficult for MergeSort since one has to move the data in the array up by 1, see here) for more information.


class Solution {
public:
int findKthLargest(vector<int>& nums, int k) {
int left = 0, right = nums.size() - 1;
return quickSelect(nums, 0, nums.size() - 1, k - 1);
}

int quickSelect(vector<int>& nums,int start,int end, int k){
int pos = partition(nums, start, end);
if (pos == k) return nums[pos];
else if (pos > k) return quickSelect(nums, start, pos - 1, k);
else return quickSelect(nums, pos + 1, end, k);
}

// Lomuto partition scheme
int partition(vector<int>& nums,int l,int r){
int pivot = nums[r];
int i = l;
for(int j = l;j < r;++j){
if(nums[j] >= pivot) swap(nums[i++],nums[j]);
}
swap(nums[i],nums[r]);
return i;
}

int quickSelect2(vector<int>& nums,int start,int end, int k){
if (end == start) return nums[start];
int pos = partition2(nums, start, end);
if (pos >= k) return quickSelect(nums, start, pos, k);
else return quickSelect(nums, pos + 1, end, k);
}

// Hoare partition scheme (has 3 times expected swaps compared to Lomuto)
int partition2(vector<int>& nums,int l,int r){
int x = nums[l], i = l - 1, j = r + 1;
while (1) {
do  j--; while (nums[j] < x);
do  i++; while (nums[i] > x);
if  (i < j)
swap(nums[i], nums[j]);
else
return j;
}
}
};


### Leetcode 675. Cut Off Trees for Golf Event

#### Posted on Oct 23, 2017

The trick here is that since we have to go from smallest height to largest height, then why don't we store all required information about the trees and sort them according to the height? Then we are breaking the big traversal problem into a sequence of small traversal problems, which can either be done by BFS or DFS. But since here we need to record the minimum number of steps taken (recall the minimum depth question in binary tree: BFS is preferred over DFS, because DFS has to iterate the tree once even if the tree root has a leaf left child and this is not very efficient; while BFS is level traversal, so we do less work despite the same upper time complexity bound). Same reason here, we use BFS here and we can implement BFS iteratively using queue and use visited matrix to record which nodes have been traversed such that we don't have to go through again in that particular traversal. (Note that for the next traversal between two trees we are actually allowed to go through the same node and this is due to the existing obstacles). Another thing I did not realize in the beginning is that the starting point could be grass instead of trees and between trees we can traverse on grass.


class Solution {
public:
int cutOffTree(vector<vector<int>>& forest) {
if (forest.empty() || forest[0].empty()) return 0;
int m = forest.size(), n = forest[0].size();
vector<vector<int>> trees;
// get all the tree positions and sort based on height
// trees[i][0] is height. The default comparison of vector compare first element before other elements.
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
if (forest[i][j] > 1) trees.push_back({forest[i][j], i, j});
}
}
sort(trees.begin(), trees.end());
int ans = 0;
// accumulate all the paths
for (int i = 0, cur_row = 0, cur_col = 0; i < trees.size(); i++) {
int step = next_step(forest, cur_row, cur_col, trees[i][1], trees[i][2]);
// if next tree cannot be reached, step = -1;
if (step == -1) return -1;
ans += step;
cur_row = trees[i][1];
cur_col = trees[i][2];
}
return ans;
}
private:
// BFS to find shortest path to next tree; if cannot reach next tree, return -1
int next_step(vector<vector<int>>& forest, int sr, int sc, int er, int ec) {
if (sr == er && sc == ec) return 0;
int m = forest.size(), n = forest[0].size();
queue<pair<int, int>> myq;
myq.push({sr, sc});
vector<vector<int>> visited(m, vector<int>(n, 0));
visited[sr][sc] = 1;
int step = 0;
vector<int> dir = {-1, 0, 1, 0, -1};
while (!myq.empty()) {
step++;
int sz = myq.size();
for (int i = 0; i < sz; i++) {
int row = myq.front().first, col = myq.front().second;
myq.pop();
for (int i = 0; i < 4; i++) {
int r = row + dir[i], c = col + dir[i+1];
if (r < 0 || r >= m || c < 0 || c >= n || visited[r][c] == 1 || forest[r][c] == 0) continue;
if (r == er && c == ec) return step;
visited[r][c] = 1;
myq.push({r, c});
}
}
}
return -1;
}
};


### Leetcode 206. Reverse Linked List

#### Posted on Oct 19, 2017

I realize that it is very important to visualize things. For example, during interviews, one should be able to draw what I did below. That really helps you coding anyway if you don't use papers (maybe even better than paper/pen, whiteboard is my favorite though). Today when I explained it to my gf, I realized that all of a sudden that recursion can be thought of mathemtical induction. The most challenging part is to figure out how to use previous results from calling recursion function to get the results for this step. Also, finding the conditions when recursion ends is sometimes challenging.


// dummy   1   2   3    4
//
// dummy   2    1   3    4
//             cur  tmp
// dummy   3    2   1    4
//        tmp            cur
class Solution {
public:
ListNode* reverseList(ListNode* head) {
ListNode *dummy = new ListNode(-1);
ListNode *cur = head;
while (cur->next) {
ListNode *tmp = cur->next;
cur->next = tmp->next;
tmp->next = dummy->next;
dummy->next = tmp;
}
return dummy->next;
}
};

// 1   2   3    4
// p
// 1   4   3    2
// p   h        p->next
class Solution {
public:
ListNode* reverseList(ListNode* head) {
ListNode *p = head;
p->next->next = p;
p->next = nullptr;
}
};


### Leetcode 3. Longest Substring Without Repeating Characters

#### Posted on Oct 19, 2017

Think about the case the substring is very very long. So it is not efficient to use set. Also difficult to use dynamic programming without additional space. I think the solution below still belongs to dynamic programming but need to memorize two things: what is the left starting point; what is the latest index for each character (256 buckets should be enough).


class Solution {
public:
int lengthOfLongestSubstring(string s) {
int m[256] = {0};
int res = 0, left = 0;
for (int i = 0; i < s.size(); ++i) {
if (m[s[i]] == 0 || m[s[i]] < left) {
res = max(res, i - left + 1);
} else {
left = m[s[i]];
}
m[s[i]] = i + 1;
}
return res;
}
};


### Leetcode 2. Add Two Numbers

#### Posted on Oct 17, 2017

The trickly part is that you need to check if there is still carry after iterating both lists.


class Solution {
public:
ListNode* addTwoNumbers(ListNode* l1, ListNode* l2) {
ListNode *res = new ListNode(-1);
ListNode *cur = res;
int carry = 0;
while (l1 || l2) {
int n1 = l1 ? l1->val : 0;
int n2 = l2 ? l2->val : 0;
int sum = n1 + n2 + carry;
carry = sum / 10;
cur->next = new ListNode(sum % 10);
cur = cur->next;
if (l1) l1 = l1->next;
if (l2) l2 = l2->next;
}
if (carry) cur->next = new ListNode(carry);
return res->next;
}
};


### Leetcode 146. LRU Cache

#### Posted on Oct 17, 2017

Doing top leetcode questions for Amazon....LRU cache is one of most tested questions. It is very hard to implement in python but much easier to do it in C++, since linked-list in implemented in std library for C++. One need to remember to put the last accessed item (either get or put) into front of list. For put, one should also consider the corner case the capacity will exceed the preset size.


class LRUCache {
public:
LRUCache(int capacity) {
cap = capacity;
}

int get(int key) {
auto it = m.find(key);
if (it == m.end()) {
return -1;
} else {
l.splice(l.begin(), l, it->second);
return it->second->second;
}
}

void put(int key, int value) {
auto it = m.find(key);
if (it != m.end()) {
l.erase(it->second);
}
l.push_front(make_pair(key, value));
m[key] = l.begin();
if (m.size() > cap) {
int k = l.rbegin()->first;
l.pop_back();
m.erase(k);
}
}

private:
int cap;
list<pair<int, int>> l;
unordered_map<int, list<pair<int, int>>::iterator> m;
};

/**
* Your LRUCache object will be instantiated and called as such:
* LRUCache obj = new LRUCache(capacity);
* int param_1 = obj.get(key);
* obj.put(key,value);


### Leetcode 102. Binary Tree Level Order Traversal

#### Posted on Oct 16, 2017

Use queue for level-order while for preorder, postorder and inorder, we need to use stack.


class Solution {
public:
vector<vector<int>> levelOrder(TreeNode* root) {
vector<vector<int>> res;
if (root == nullptr) return res;
queue<TreeNode*> q;
q.push(root);
while (!q.empty()) {
vector<int> onelevel;
int size = q.size();
for (int i = 0; i < size; ++i) {
TreeNode *node = q.front();
q.pop();
onelevel.push_back(node->val);
if (node->left) {
q.push(node->left);
}
if (node->right) {
q.push(node->right);
}
}
res.push_back(onelevel);
}
return res;
}
};


### Leetcode 145. Binary Tree Postorder Traversal

#### Posted on Oct 16, 2017

Posterorder is probably the most complicated out of three (or four, if level traversal is included).


class Solution {
public:
vector<int> postorderTraversal(TreeNode* root) {
vector<int> res;
stack<TreeNode*> s;
TreeNode *p = root;
TreeNode *prev = nullptr;
while (p || !s.empty()) {
while (p) {
s.push(p);
p = p->left;
}
TreeNode *node = s.top();
if (node->right == nullptr || node->right == prev) {
s.pop();
res.push_back(node->val);
prev = node;
} else {
p = node->right;
}
}
return res;
}
};


### Leetcode 144. Binary Tree Preorder Traversal

#### Posted on Oct 16, 2017

Just want to review the most important questions for trees.


/**
* Definition for a binary tree node.
* struct TreeNode {
*     int val;
*     TreeNode *left;
*     TreeNode *right;
*     TreeNode(int x) : val(x), left(NULL), right(NULL) {}
* };
*/
class Solution {
public:
vector<int> preorderTraversal(TreeNode* root) {
vector<int> res;
stack<TreeNode*> s;
if (!root) {
return res;
}
s.push(root);
while (!s.empty()) {
TreeNode* p = s.top();
s.pop();
res.push_back(p->val);
if (p->right) {
s.push(p->right);
}
if (p->left) {
s.push(p->left);
}
}
return res;
}
};


### Leetcode 94. Binary Tree Inorder Traversal

#### Posted on Oct 12, 2017

Inorder is similiar to preorder: we can do it both in recursion and iteratively. For iterative, the main difference is that we have insert differently, we need to insert left node as many as possible, then add middle to res, and right in the end.


class Solution {
public:
vector<int> inorderTraversal(TreeNode* root) {
vector<int> res;
if (!root) {
return res;
}
inorderTraversal(root, res);
return res;
}

void inorderTraversal(TreeNode* root, vector<int>& res) {
if (root->left) {
inorderTraversal(root->left, res);
}
res.push_back(root->val);
if (root->right) {
inorderTraversal(root->right, res);
}
}
};

// Non-recursion
class Solution {
public:
vector<int> inorderTraversal(TreeNode* root) {
vector<int> res;
stack<TreeNode*> s;
TreeNode *p = root;
while (p || !s.empty()) {
while (p) {
s.push(p);
p = p->left;
}
p = s.top();
s.pop();
res.push_back(p->val);
p = p->right;
}
return res;
}
};


### Leetcode 42. Trapping Rain Water

#### Posted on Oct 12, 2017

I think the solution is very smart, taking advantage of the fact that water will be trapped if the height is lower than left and right wall. and then moving the lower one of the two walls to update the res until the wall height is increased. Then one can consider the new left and right wall as the new problem and solve again using same algorithm.


class Solution {
public:
int trap(vector<int>& height) {
int res = 0, left = 0, right = height.size() - 1;
while (left < right) {
int mn = min(height[left], height[right]);
if (mn == height[left]) {
++left;
while (left < right && height[left] < mn) {
res += (mn - height[left]);
++left;
}
} else {
--right;
while (right > left && height[right] < mn) {
res += (mn - height[right]);
--right;
}
}
}
return res;
}
};



### Leetcode 84. Largest Rectangle in Histogram

#### Posted on Oct 12, 2017

Two solutions use the same idea that when the rectangle height is increasing, there is no need to update res, we only need to do so when we encounter a lower one. But the first one still can traversal more than $n$ due to the for loop. (Consider the example case when $i = 5$). The second one, to the contrary, will skip all the uncessary steps, will only update res exactly $n$ times. One can print out to see the difference of two methods.


class Solution {
public:
int largestRectangleArea(vector<int>& height) {
int res = 0;
for (int i = 0; i < height.size(); ++i) {
if (i + 1 < height.size() && height[i] <= height[i + 1]) {
continue;
}
int minH = height[i];
for (int j = i; j >= 0; --j) {
minH = min(minH, height[j]);
int area = minH * (i - j + 1);
res = max(res, area);
//cout << i << " " << j << " " << res << endl;
}
}
return res;
}
};

class Solution {
public:
int largestRectangleArea(vector<int>& heights) {
int res = 0;
stack<int> s;
heights.push_back(0);
for (int i = 0; i < heights.size(); ++i) {
while (!s.empty() && heights[s.top()] > heights[i]) {
int j = s.top();
s.pop();
res = max(res, heights[j] * (s.empty() ? i : (i - s.top() - 1)));
//cout << i << " " << j << " " <<
//    heights[j] * (s.empty() ? i : (i - s.top() - 1)) << endl;
}
s.push(i);
}
return res;
}
};



### Leetcode 238. Product of Array Except Self

#### Posted on Oct 11, 2017

I actually encountered this question last year in the first round of technical interview with Google. No wonder that I screwed it up. I still remember how bad I feel after the interviewer questioned that why I still have no paper published at my third year of PhD. Basically, one cannot use divide and therefore should maintain a copy of product of all numbers to the left and then go from right to left. This way, unlike the second approach, we do not need to use extra space but the space of storing the results.


class Solution {
public:
vector<int> productExceptSelf(vector<int>& nums) {
vector<int> res(nums.size(), 1);
for (int i = 1; i < nums.size(); ++i) {
res[i] = res[i - 1] * nums[i - 1];
}
int right = 1;
for (int i = nums.size() - 1; i >= 0; --i) {
res[i] *= right;
right *= nums[i];
}
return res;
}

vector<int> productExceptSelf(vector<int>& nums) {
int n = nums.size();
vector<int> fwd(n, 1), bwd(n, 1), res(n);
for (int i = 0; i < n - 1; ++i) {
fwd[i + 1] = fwd[i] * nums[i];
}
for (int i = n - 1; i > 0; --i) {
bwd[i - 1] = bwd[i] * nums[i];
}
for (int i = 0; i < n; ++i) {
res[i] = fwd[i] * bwd[i];
}
return res;
}
};


### Leetcode 208. Implement Trie (Prefix Tree)

#### Posted on Oct 10, 2017

Just want to remind for data strucure of prefix tree


class TrieNode {
public:
// Initialize your data structure here.
TrieNode* child[26];
bool isWord;
TrieNode() : isWord(false){
//for (int i = 0; i < 26; ++i) {
//    child[i] = NULL;
//}
for (auto &a : child) a = NULL;
}
};

class Trie {
public:
Trie() {
root = new TrieNode();
}

// Inserts a word into the trie.
void insert(string s) {
TrieNode *p = root;
for (auto &a : s) {
int i = a - 'a';
if (!p->child[i]) p->child[i] = new TrieNode();
p = p->child[i];
}
p->isWord = true;
}

// Returns if the word is in the trie.
bool search(string key) {
TrieNode *p = root;
for (auto &a : key) {
int i = a - 'a';
if (!p->child[i]) return false;
p = p->child[i];
}
return p->isWord;
}

// Returns if there is any word in the trie
// that starts with the given prefix.
bool startsWith(string prefix) {
TrieNode *p = root;
for (auto &a : prefix) {
int i = a - 'a';
if (!p->child[i]) return false;
p = p->child[i];
}
return true;
}

private:
TrieNode* root;
};

/**
* Your Trie object will be instantiated and called as such:
* Trie obj = new Trie();
* obj.insert(word);
* bool param_2 = obj.search(word);
* bool param_3 = obj.startsWith(prefix);
*/


### Leetcode 336. Palindrome Pairs

#### Posted on Oct 09, 2017

The use of hashtable is quite straightforward. But the use of set here is quite smart. Instead of unordered_set, set keeps the order of length of each word in the set, therefore we only need to iterate the iterator from s.begin() to s.find(len_w), since we only need to consider word of length less than current word. Other than the use of set, the other part worth paying attension is that there are three different cases: 'abd|dba', 'abdd|ba', 'ab|ddba' for current word of 'abd','abdd' and 'ddba'.


class Solution {
public:
vector<vector<int>> palindromePairs(vector& words) {
vector<vector<int>> res;
unordered_map<string, int> m;
set<int> s;
for (int i = 0; i < words.size(); ++i) {
m[words[i]] = i;
s.insert(words[i].size());
}
for (int i = 0; i < words.size(); ++i) {
string word = words[i];
reverse(word.begin(), word.end());
if (m.find(word) != m.end() && m[word] != i) {
res.push_back({i, m[word]});
}
int len_w = word.size();
for (auto it = s.begin(); it != s.find(len_w); ++it) {
// len_it is the length of candidate string
int len_it = *it;
// deal with left substring of reversed word is valid palindrome
if (isValid(word, 0, len_w - len_it - 1) && m.count(word.substr(len_w - len_it))) {
res.push_back({i, m[word.substr(len_w - len_it)]});
}
// deal with right substring of reversed word is valid palindrome
if (isValid(word, len_it, len_w - 1) && m.count(word.substr(0, len_it))) {
res.push_back({m[word.substr(0, len_it)], i});
}
}
}
return res;
}

bool isValid(string str, int left, int right) {
while (left < right) {
if (str[left++] != str[right--]) {
return false;
}
}
return true;
}
};


### Leetcode 680. Valid Palindrome II

#### Posted on Oct 09, 2017

It is an easy question but I want to emphasize that writing an additional helper function isValid makes the main function much easier to read and I belive readibility is very very important when writing C++ codes. (I don't know why but it seems that I made less stupid mistakes when I write in C++ compared to in Python, is it due to the type systems?).


class Solution {
public:
bool validPalindrome(string s) {
int left = 0;
int right = s.size() - 1;
while (left < right) {
if (s[left] != s[right]) {
return isValid(s, left + 1, right) || isValid(s, left, right - 1);
} else {
++left;
--right;
}
}
return true;
}

bool isValid(string s, int left, int right) {
while (left < right) {
if (s[left] != s[right]) {
return false;
} else {
++left;
--right;
}
}
return true;
}
};


### Leetcode 651. 4 Keys Keyboard

#### Posted on Oct 09, 2017

This question is also about using dynamic programming. Similar idea as previous question.


class Solution {
public:
int maxA(int N) {
vector<int> dp(N + 1, 0);
for (int i = 1; i < N + 1; ++i) {
dp[i] = i;
for (int j = 1; j < i; ++j) {
dp[i] = max(dp[i], dp[j] * (i - j - 1));
}
}
return dp[N];
}
};


### Leetcode 650. 2 Keys Keyboard

#### Posted on Oct 09, 2017

This question is about using dynamic programming. The most important thing for dp is that one needs to find the recursive relationships. Here, the recursive equation is dp[i] = min(dp[i], dp[j] + i / j) when i % j == 0. The reason for this, when j is a divisor of i, we can simply copy (1 time) and paste (i / j - 1 times), and in total i / j times to obtain the final result. One can also use recursiion to solve dp questions most of time but lead to a suboptimal time complexity.


class Solution {
public:
int minSteps(int n) {
vector<int> dp(n + 1, 0);
for (int i = 2; i < n + 1; ++i) {
dp[i] = i;
for (int j = 2; j < i; ++j) {
if (i % j == 0) {
dp[i] = min(dp[i], dp[j] + i / j);
}
}
}
return dp[n];
}
};


### Leetcode 291. Word Pattern II

#### Posted on Oct 05, 2017

This question is much harder than 290. Word Pattern despite the only difference it removes the space between words. What makes it hard is that now we don't know where does one word, where does one end. In this kind of scenario, one should use backtracking. I used set for the same reason as previous question. The beginning of helper function is what I called "ending conditions" and recursion logic is buried in the foor loop (there are two separate cases for each potential word: either in hashtable or not). One should deal with those two cases separately.


class Solution {
public:
bool wordPatternMatch(string pattern, string str) {
unordered_map<char, string> m;
set<string> s;
return helper(pattern, 0, str, 0, m, s);
}

bool helper(string& pattern, int p1, string& str, int p2,
unordered_map<char, string>& m, set<string>& s) {
if (p1 == pattern.size() && p2 == str.size()) {
return true;
}
if (p1 == pattern.size() || p2 == str.size()) {
return false;
}
char c = pattern[p1];
for (int i = p2; i < str.size(); ++i) {
string word = str.substr(p2, i - p2 + 1);
if (m.find(c) != m.end()) {
if (m[c] == word) {
return helper(pattern, p1 + 1, str, i + 1, m, s);
}
} else {
if (s.find(word) != s.end()) {
continue;
}
m[c] = word;
s.insert(word);
if (helper(pattern, p1 + 1, str, i + 1, m, s)) {
return true;
}
m.erase(c);
s.erase(word);
}
}
return false;
}
};


### Leetcode 290. Word Pattern

#### Posted on Oct 05, 2017

Just realized there is no need to use two unordered_map. Can simply use one. Well, it is a trade-off between space and time.


class Solution {
public:
bool wordPattern(string pattern, string str) {
unordered_map<char, string> m;
istringstream in(str);
int i = 0;
for (string word; in >> word; ++i) {
char c = pattern[i];
if (m.find(c) != m.end()) {
if (m[c] != word)
return false;
} else {
for (auto it = m.begin(); it != m.end(); ++it) {
if (it->second == word)
return false;
}
m[c] = word;
}
}
return i == pattern.size();
}
};

class Solution {
public:
bool wordPattern(string pattern, string str) {
unordered_map<char, string> m;
// use set to avoid traversing the m, but uses more space
set<string> s;
istringstream in(str);
int i = 0;
for (string word; in >> word; ++i) {
char c = pattern[i];
if (m.find(c) != m.end()) {
if (m[c] != word)
return false;
} else {
/*for (auto it = m.begin(); it != m.end(); ++it) {
if (it->second == word)
return false;
}*/
if (s.count(word)) {
return false;
}
m[c] = word;
s.insert(word);
}
}
return i == pattern.size();
}
};


### Leetcode 362. Design Hit Counter

#### Posted on Oct 05, 2017

I think the main difficult lies in the follow-up questions: What if the number of hits per second could be very large? Does your design scale?

I think in a real interview, that is the kind of questions you should bring it up to the interview before you are asked. It is quite likely that there are thousands if not millions of hits each second. One may wish to use long or long long instead of int below to avoid overflow issues. since $2^{32} - 1 = 4,294,967,295$, it is not that a large number. Overflow could happen when we don't have 300 seconds but a lot more granularities. maybe 300 * 60 ms. Also try not use 300 directly in the codes, define a constant number instead. The algorithm side, using a bucket of 300 is quick smart, definitely smarter than using a queue and returning the size of queue as the result. Since the size of queue could be very large and that is simply a waste of memory when you don't need to store the exact timestamp (same here) many many times. It could easily lead to memory issues when scaling up. Therefore, one should never do that in practice. Instead the requirement says that you need to keep the history, then you should write it to a database/disk for every second query results. I think more likely case is that not only hits are useful but also the hit comes with other useful information and those information usually is written to database. One may wish to access those later. (a need to record the hits id within past 300 seconds). One just need to add a vector<vector<int>> to store those.


class HitCounter {
public:
/** Initialize your data structure here. */
HitCounter() {
times.resize(300, 0);
hits.resize(300, 0);
}

/** Record a hit.
@param timestamp - The current timestamp (in seconds granularity). */
void hit(int timestamp) {
int idx = timestamp % 300;
if (times[idx] == timestamp) {
hits[idx]++;
} else {
times[idx] = timestamp;
hits[idx] = 1;
}
return;
}

/** Return the number of hits in the past 5 minutes.
@param timestamp - The current timestamp (in seconds granularity). */
int getHits(int timestamp) {
int res = 0;
for (int i = 0; i < NUM_BUCKETS; ++i) {
// the line below is not needed of there are hits every second
if (timestamp - times[i] < 300) {
res += hits[i];
}
}
return res;
}

private:
int NUM_BUCKETS = 300;
// record timestamp for past 300 seconds
vector<int> times;
// record how many hits for past 300 seconds
vector<int> hits;

};

/**
* Your HitCounter object will be instantiated and called as such:
* HitCounter obj = new HitCounter();
* obj.hit(timestamp);
* int param_2 = obj.getHits(timestamp);
*/


### Leetcode 609. Find Duplicate File in System

#### Posted on Oct 05, 2017

Use of Hashmap. I feel whenever one is dealing with string, it is much easier to do so using Python. Here, we don't have to use dfs or bfs, since the paths have been given. Those follow up questions are quite interesting:

• 1. Imagine you are given a real file system, how will you search files? DFS or BFS?
• 2. If the file content is very large (GB level), how will you modify your solution?
• 3. If you can only read the file by 1kb each time, how will you modify your solution?
• 4. What is the time complexity of your modified solution? What is the most time-consuming part and memory consuming part of it? How to optimize?
• 5. How to make sure the duplicated files you find are not false positive?
My understandings are:
• 1. In real life, one may want to use dfs over bfs when the search is deep, since bfs uses much more memory than dfs. BFS may have better cache performance, due to locality of files on disk. But if you know the files you are searching for is close to the node you are currently at, then BFS may be performing better.
• 2. It is not possible to hash the entire contents as what the codes did below. Instead, one should first map files accoring to file size, for the files of same size, hash a small part of file using MD5 for example. Only when all those are the same, we will compare byte by byte.
• 3. We can still hash 1k chunk similarly.
• 4. Worst time complexity is $O(n^2 * k)$, when all files have the same size k bytes, and we have to compare every two files byte by byte.
• 5. file size, hash, byte by byte as last sanity check.


class Solution {
public:
vector<vector<string>> findDuplicate(vector<string>& paths) {
unordered_map<string, vector<string>> files;
vector<vector<string>> res;
for (auto path : paths) {
stringstream ss(path);
string root;
string s;
getline(ss, root, ' ');
while (getline(ss, s, ' ')) {
string filename = root + '/' + s.substr(0, s.find('('));
string filecontent = s.substr(s.find('(') + 1, s.find(')') - s.find('(') - 1);
files[filecontent].push_back(filename);
}
}

for (auto it = files.begin(); it != files.end(); ++it) {
if ((it->second).size() > 1) {
res.push_back(it->second);
}
}
return res;
}
};


### Leetcode 549. Binary Tree Longest Consecutive Sequence II

#### Posted on Sep 18, 2017

Recursion for Binary Tree.


class Solution {
public:
int longestConsecutive(TreeNode* root) {
if (!root) {
return 0;
}
int res = 0;
res = helper(root, 1) + helper(root, -1) + 1;
return max(res, max(longestConsecutive(root->left), longestConsecutive(root->right)));
}

private:
int helper(TreeNode* root, int diff) {
if (!root) {
return 0;
}
int left = 0, right = 0;
if (root->left && root->left->val == root->val + diff) {
left = 1 + helper(root->left, diff);
}
if (root->right && root->right->val == root->val + diff) {
right = 1 + helper(root->right, diff);
}
return max(left, right);
}
};


### Leetcode 298. Binary Tree Longest Consecutive Sequence

#### Posted on Sep 18, 2017

Recursion for Binary Tree.


/**
* Definition for a binary tree node.
* struct TreeNode {
*     int val;
*     TreeNode *left;
*     TreeNode *right;
*     TreeNode(int x) : val(x), left(NULL), right(NULL) {}
* };
*/

class Solution {
public:
int longestConsecutive(TreeNode* root) {
if (!root) return 0;
int res = 0;
dfs(root, 1, res);
return res;
}

private:
void dfs(TreeNode* root, int len, int& res) {
res = max(res, len);
if (root->left) {
if (root->left->val == root->val + 1) {
dfs(root->left, len + 1, res);
} else {
dfs(root->left, 1, res);
}
}
if (root->right) {
if (root->right->val == root->val + 1) {
dfs(root->right, len + 1, res);
} else {
dfs(root->right, 1, res);
}
}
}
};

class Solution {
public:
int longestConsecutive(TreeNode* root) {
return helper(root, nullptr, 0);
}

private:
int helper(TreeNode* root, TreeNode* p, int res) {
if (!root) return res;
res = (p && root->val == p->val + 1) ? res + 1 : 1;
return max(res, max(helper(root->left, root, res), helper(root->right, root, res)));
}
};


### Leetcode 296. Best Meeting Point

#### Posted on Sep 18, 2017

Consider one dimension first. Then one will realize the median of an odd sequence or any value following in between middle two values of an even sequence are the results. Similarly, for higher dimensions, we can consider each dimension one by one.


class Solution {
public:
int minTotalDistance(vector<vector<int>>& grid) {
vector<int> rows, cols;
for (int i = 0; i < grid.size(); ++i) {
for (int j = 0; j < grid[0].size(); ++j) {
if (grid[i][j] == 1) {
rows.push_back(i);
cols.push_back(j);
}
}
}
return minTotalDistance(rows) + minTotalDistance(cols);
}

private:
int minTotalDistance(vector<int>& v) {
sort(v.begin(), v.end());
int res = 0, i = 0, j = v.size() - 1;
while (i < j) {
res += (v[j--] - v[i++]);
}
return res;
}
};


### Leetcode 317. Shortest Distance from All Buildings

#### Posted on Sep 18, 2017

BFS using queue. We need to do BFS every time we have a building to create a distance matrix for that building. Then we have a cumulative matrix of distances for all buildings. We just need to find the minimum number in that cumulative matrix. Also, need visited bool matrix to avoid adding same node twice in the queue and a cnt matrix to count how many buildings can be reached.


class Solution {
public:
int shortestDistance(vector<vector<int>>& grid) {
int buildingCnt = 0, m = grid.size(), n = grid[0].size();
vector<vector<int>> dist(m, vector<int>(n, 0)), cnt = dist;
vector<vector<int>> dirs{{0, -1}, {0, 1}, {-1, 0}, {1, 0}};
for (int i = 0; i < m; ++i) {
for (int j = 0; j < n; ++j) {
if (grid[i][j] == 1) {
++buildingCnt;
queue<pair<int, int>> q;
q.push({i, j});
int level = 1;
vector<vector<bool>> visited(m, vector<bool>(n, false));
while (!q.empty()) {
int size = q.size();
for (int s = 0; s < size; ++s) {
int a = q.front().first, b = q.front().second;
q.pop();
for (auto dir : dirs){
int x = a + dir[0], y = b + dir[1];
if (x >= 0 && x < m && y >= 0 && y < n
&& grid[x][y] == 0 && visited[x][y] == false) {
visited[x][y] = true;
dist[x][y] += level;
cnt[x][y] += 1;
q.push({x, y});
}
}
}
++level;
}
}
}
}
int res = INT_MAX;
for (int i = 0; i < m; ++i) {
for (int j = 0; j < n; ++j) {
if (grid[i][j] == 0 && cnt[i][j] == buildingCnt) {
res = min(res, dist[i][j]);
}
}
}
return (res == INT_MAX) ? -1 : res;
}
};


### Leetcode 542. 01 Matrix

#### Posted on Sep 17, 2017

BFS using queue. Starting from any 0 point. The reason that we do not need additional visisted boolean matrix here is that we have the condition that matrix[x][y] > matrix[t.first][t.second] to prevent one node being added twice.


class Solution {
public:
vector<vector<int>> updateMatrix(vector<vector<int>>& matrix) {
int m = matrix.size(), n = matrix[0].size();
vector<vector<int>> dirs{{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
queue<pair<int, int>> q;
for (int i = 0; i < m; ++i) {
for (int j = 0; j < n; ++j) {
if (matrix[i][j] == 0) {
q.push({i, j});
} else {
matrix[i][j] = INT_MAX;
}
}
}
while (q.empty() == false) {
auto t = q.front();
q.pop();
for (auto dir : dirs) {
int x = t.first + dir[0], y = t.second + dir[1];
if (x >= 0 && x < m && y >= 0 && y < n) {
if (matrix[x][y] > matrix[t.first][t.second]) {
matrix[x][y] = matrix[t.first][t.second] + 1;
q.push({x, y});
}
}
}
}
return matrix;
}
};


### Leetcode 479. Largest Palindrome Product

#### Posted on Sep 17, 2017

makePalindrome() function to create a palindrome number in a traditional way without using string. Moreover, it supports to create palindrome of odd length and of even length. With left part of palindrome number (including middle digit for odd length) equals to seed $x$. In pratice, what I found is that it always returns in the first for loop in largestPalindrome(), not sure why....maybe the existence can be proved. Also, for even $n$, we can further find the trick of generating the palindrome number by performing when $n = 2$, $(10^2+1)*(10^2-1)-(10^1*(10^2-1)) = 9009$ and when $n = 6$, get $(10^6+1)*(10^6-1)-(10^3*(10^6-1)) = 999000000999$， instead of largest candidate $9999$ and $999999999999$. So for even number, we can get the results easily. Also, here $n$ cannot be too large, otherwise, there will be overflow issues for long (32 bits).


class Solution {
public:
long makePalindrome(int x, bool odd) {
long p = x;

if (odd)
x /= 10;

while (x > 0) {
p = p * 10 + x % 10;
x /= 10;
}

return p;
}

bool isProductOfNumsInRange(long x, int min, int max)
{
int d_max = x / min;
if (d_max > max)
d_max = max;

for (long d = d_max; d * d >= x; d--)
if (x % d == 0) {
cout << d << " " << x / d << endl;
return true;
}

return false;
}

int largestPalindrome(int n) {
int max = pow(10, n) - 1;
int min = pow(10, n - 1);

for (int x = max; x > 0; x--) {
long p = makePalindrome(x, false);
if (isProductOfNumsInRange(p, min, max)) {
cout << p;
return p % 1337;
}
}

for (int x = max; x > 0; x--) {
long p = makePalindrome(x, true);
if (isProductOfNumsInRange(p, min, max))
return p % 1337;
}

return 0;
}
};


### Leetcode 393. UTF-8 Validation

#### Posted on Sep 17, 2017

First, keep in mind that 0b used to represent binary literal and 0x for hexadecimal, 0 for octal. See here. To know more about utf-8 and relationship with ASCII and extended-ASCII, read wikipedia.


class Solution {
public:
bool validUtf8(vector<int>& data) {
int cnt = 0;
for (int d : data) {
if (cnt == 0) {
if ((d >> 5) == 0b110) {
cnt = 1;
} else if ((d >> 4) == 0b1110) {
cnt = 2;
} else if ((d >> 3) == 0b11110) {
cnt = 3;
} else if ((d >> 7) == 0b0) {
cnt = 0;
} else {
return false;
}
} else {
if ((d >> 6) == 0b10) {
--cnt;
} else {
return false;
}
}
}
return cnt == 0;
}
};


### Leetcode 657. Judge Route Circle

#### Posted on Sep 17, 2017

It seems the question already assumes each character has to be one of 'R', 'L', 'U', 'D' and using unordered_map is slowerer than using 4 variables due to the overhead. One can simply use 4 variables each corresponding to one of the moves and check the same criteria. Well, in that case, one can even use two variables just representing the difference of each pair of moves. One variable does not work all the time, since decoding $R = 2, L = -2, U = 3, D = -3$, then $3 * R + 2 * D = 0$, although they should not be so. It only works if lowest common multiple: $x * R = y * U$ and $x > n, y > n$, where n is the length of string. Only satisfying this criteria, one can use one global variable.


class Solution {
public:
bool judgeCircle(string moves) {
unordered_map<char, int> m ({
{'R', 0}, {'L', 0}, {'U', 0}, {'D', 0}
});
for (int i = 0; i < moves.size(); ++i) {
char c = moves[i];
if (m.find(c) == m.end()) {
return false;
} else {
m[c] += 1;
}
}
if (m['R'] == m['L'] && m['U'] == m['D']) {
return true;
} else {
return false;
}
}
};


### Leetcode 81. Search in Rotated Sorted Array II

#### Posted on Sep 8, 2017

The main difference this question and previous question is that this may contain duplicates. Therefore, one should consider cases when duplicate affect the decision whether left or right subsequence is increasing. Consider $[3, 1, 1]$ and $[1, 1, 3, 1]$, one can see that it is impossible to tell when $nums[mid] = nums[right]$, but we do know that nums[right] is not the target we want, therefore, we can simply by removing the right-most element and trim into a smaller subsequence which has a deterministic relationship between nums[mid] and nums[right] without affecting the final answer. If every element is exactly the same, then we have $O(n)$ time complexity instead of $O(log(n))$ and this is the best we can do.


class Solution {
public:
bool search(vector<int> nums, int target) {
int n = nums.size();
if (n == 0) return false;
int left = 0, right = n - 1;
while (left <= right) {
int mid = left + (right - left) / 2;
if (nums[mid] == target) {
return true;
}
// consider [3, 1, 1] and [1, 1, 3, 1]
if (nums[mid] == nums[right]) {
--right;
} else if (nums[mid] > nums[right]) {
// [4, 5, 6, 7, -1, 0, 1, 2]
// left side is increasing
if (nums[left] <= target && target < nums[mid]) {
right = mid - 1;
} else {
left = mid + 1;
}
} else {
// [4, 5, 6, -2, -1, 0, 1, 2]
// right side is increasing
if (nums[mid] < target && target <= nums[right]) {
left = mid + 1;
} else {
right = mid - 1;
}
}
}
return false;
}
};


### Leetcode 33. Search in Rotated Sorted Array

#### Posted on Sep 8, 2017

if value at mid is less than value at right, then it means the right subsequence is increasing and we want to check if target is in this subsequence or not. If not, we go to left hand side. If value at mid is greater than the value at right, that means we have the left subsequence increasing. We should have different rules of checking if target value falls left or right hand side for those two scenarios.


class Solution {
public:
int search(vector<int>& nums, int target) {
if (nums.empty()) return -1;
if (nums.size() == 1) return (nums[0] == target) ? 0 : -1;
int left = 0, right = nums.size() - 1;
while (left <= right) {
int mid = left + (right - left) / 2;
if (nums[mid] == target) {
return mid;
}
if (nums[mid] < nums[right]) {
if (nums[mid] < target && target <= nums[right]) {
left = mid + 1;
} else {
right = mid - 1;
}
} else {
if (nums[left] <= target && target < nums[mid]) {
right = mid - 1;
} else{
left = mid + 1;
}
}
}
return -1;
}
};


### Leetcode 294. Flip Game II

#### Posted on Sep 4, 2017

Pay attention: return true when there is a way to win (can win), and return false, when it must lose. Use private variable to speed-up (memorization).


class Solution {
private:
unordered_map<string, bool> m;

public:
bool canWin(string s) {
if (s.size() <= 1) {
return false;
}
if (m.find(s) != m.end()) {
return m[s];
}
for (int i = 0; i < s.size() - 1; ++i) {
if (s[i] == '+' && s[i + 1] == '+') {
s[i] = s[i + 1] = '-';
if (canWin(s)){
m[s] = true;
} else {
return true;
}
s[i] = s[i + 1] = '+';
}
}
m[s] = false;
return false;
}
};


### Leetcode 293. Flip Game

#### Posted on Sep 4, 2017

Very Straightforward. Use substr or simply change in place.


class Solution {
public:
vector<string> generatePossibleNextMoves(string s) {
vector<string> res;
int n = s.size();
if (n <= 1) return res;
for (int i = 0; i < n - 1; ++i) {
if (s[i] == '+' && s[i + 1] == '+') {
s[i] = '-';
s[i + 1] = '-';
//res.push_back(s.substr(0, i) + "--" + s.substr(i + 2));
res.push_back(s);
s[i] = '+';
s[i + 1] = '+';
}
}
return res;
}
};


### Leetcode 482. License Key Formatting

#### Posted on Sep 4, 2017

Very Straightforward. Even easier in Python.


class Solution {
public:
string licenseKeyFormatting(string S, int K) {
string res = "";
int cnt = 0;
int n = S.size();
for (int i = n - 1; i >= 0; --i) {
char c = toupper(S[i]);
if (c == '-') {
continue;
}
// use following when toupper() is not allowed
//if (c >= 'a' && c <= 'z') c -= 32;
res.push_back(c);
if (++cnt % K == 0) {
res.push_back('-');
}
}
if (!res.empty() && res.back() == '-') {
res.pop_back();
}
return string(res.rbegin(), res.rend());
}
};


### Leetcode 577. Employee Bonus

#### Posted on Sep 4, 2017

To recall the difference between on and where, see link and see here for the difference between inner_join and outer_join (left, right).


SELECT Employee.name as name, Bonus.bonus as bonus
FROM Employee LEFT OUTER JOIN Bonus
ON Employee.empId = Bonus.empId
WHERE bonus < 1000 OR bonus IS NULL;


### Leetcode 597. Friend Requests I: Overall Acceptance Rate

#### Posted on Sep 4, 2017

Use ifnull to deal with null value and use round to round to required digits, use as to change name


SELECT ROUND(IFNULL(

(SELECT COUNT(DISTINCT requester_id, accepter_id) FROM request_accepted) /

(SELECT COUNT(DISTINCT sender_id, send_to_id) FROM friend_request )

,0)  ,2)   AS accept_rate


### Leetcode 670. Maximum Swap

#### Posted on Sep 4, 2017

Idea is to store the index of last occurence for each digit and then at each step try to switch with a larger digit which has a larger last occurence index. It is greedy algorithm, trying to swtich as left as possible and as large as possible.


class Solution {
public:
int maximumSwap(int num) {
string num_s = to_string(num);
vector<char> num_c;
for (auto it = begin(num_s); it != end(num_s); ++it) {
num_c.push_back(*it);
}
vector<int> pos(10, -1);
for (int i = 0; i < num_c.size(); ++i) {
pos[num_c[i] - '0'] = i;
}
for (int i = 0; i < num_c.size(); ++i) {
for (int j = 9; j > num_c[i] - '0'; --j) {
if (pos[j] > i) {
char tmp = num_c[i];
num_c[i] = j + '0';
num_c[pos[j]] = tmp;
int res = 0;
for (int k = 0; k < num_c.size(); ++k) {
res = res * 10 + (num_c[k] - '0');
}
return res;
}
}
}
return num;
}
};


### Leetcode 277. Find the Celebrity

#### Posted on Sep 4, 2017

Just decided to restart doing the leetcodes starting from facebook/Google tag....

This solution is very brilliant: all the other solutions in terms of number of API calls are $O(n^2)$ but this solution is $O(n)$. Moreover, the space complexity is also $O(1)$. One should image this problem as the following abstraction problems: you are given a matrix of unknown values, which can only be $1$ or $-1$. If $X_{ij} = 1$ means Person $i$ knows Person $j$, $-1$ means otherwise. It costs a lot to flip to find out what the value is. Can you flip least to find the $i$-th diagonal element such that all the elements in that row is $-1$ while all the elements in the column is $1$? (Obviously you don't care what $X_{ii}$ is). Naively, if we go through the matrix one by one, and store everything, we have $O(n^2)$ for both time and space complexity. A slightly smarter way is to use an array to store bool values and eliminate certain rows and columns when doing the check for one potential candidate. (refer to codes in the first solution). But this is still quadratic. The second solution is linear, since the number of API is at most $2n$. One can think of this a shortest path from $(0, 0)$ to $(n, i)$, with $i$ unknown and to be determined. And the constraint of the path is that the horizontal path has to be consisting of all $-1$'s and when having no choice, it has to take diagonal path. This way, the path is contained in the upper diagonal matrix and that is why doing the final check, one shouldn't double check the path that has been checked already. If the final res is $i$, that means all values before $i$ cannot be answer because it was changed to $i$ when it has no choice (knows someone) and all values after $i$ also does not work because $i$ does not know the value after $i$ (so cannot be celebrity).


// Forward declaration of the knows API.
bool knows(int a, int b);

// First Solution, Quadratic API calls
class Solution {
public:
int findCelebrity(int n) {
vector<bool> candidate(n, true);
for (int i = 0; i < n; ++i) {
for (int j = 0; j < n; ++j) {
if (candidate[i] && i != j) {
if (knows(i, j) || !knows(j, i)) {
candidate[i] = false;
break;
} else {
candidate[j] = false;
}
}
}
if (candidate[i]) return i;
}
return -1;
}
};

// Second Solution, Linear (at most 2n) API calls
class Solution {
public:
int findCelebrity(int n) {
int res = 0;
for (int j = 1; j < n; ++j) {
if (knows(res, j)) {
res = j;
}
}
for (int j = 0; j < n; ++j) {
if (j < res) {
if (knows(res, j) || !knows(j, res)) {
return -1;
}
} else if (j > res) {
if (!knows(j, res)) {
return -1;
}
}
}
return res;
}
};


### Sorting Algorithms, QuickSort and QuickSelect

#### Posted on Jun 27, 2017

Notes on QuickSort and QuickSelect.

This paper talks about the cache effects on Multi-Pivot Quicksort. 3-way or 4-way is better than 2-way pivot due to cache performance

See std::nth_element for the built-in function in C++ STL.

For more details in the quickSelect implementation, see the codes below. By the way, quickSort is the same codes, except that we do not need to return $k-th$ element and we can simply sort in-place. Basically, there is no need to pass $k$ in the function parameter.


#include <iostream>
using namespace std;

// A simple print function
void print(int* input) {
for (int i = 0; i < sizeof(input)/sizeof(*input); i++)
cout << input[i] << " ";
cout << endl;
}

int partition(int* input, int p, int r) {
int pivot = input[r];
while ( p < r ) {
while (input[p] <= pivot)
p++;
while (input[r] > pivot)
r--;
// now we have input[p] > pivot && input[r] <= pivot
// we need to swap values at p and r if p < r
if (p < r) {
int tmp = input[p];
input[p] = input[r];
input[r] = tmp;
}
}
return r;
}

int quick_select(int* input, int p, int r, int k) {
if (p == r ) return input[p];
int j = partition(input, p, r);
int length = j - p + 1;
if (length == k) return input[j];
else if (k < length) return quick_select(input, p, j - 1, k);
else return quick_select(input, j + 1, r, k - length);
}

int main() {
int A1[] = { 100, 400, 300, 500, 200 };
cout << "1st order element " << quick_select(A1, 0, 4, 1) << endl;
int A2[] = { 100, 400, 300, 500, 200 };
cout << "2nd order element " << quick_select(A2, 0, 4, 2) << endl;
int A3[] = { 100, 400, 300, 500, 200 };
cout << "3rd order element " << quick_select(A3, 0, 4, 3) << endl;
int A4[] = { 100, 400, 300, 500, 200 };
cout << "4th order element " << quick_select(A4, 0, 4, 4) << endl;
int A5[] = { 100, 400, 300, 500, 200 };
cout << "5th order element " << quick_select(A5, 0, 4, 5) << endl;
}


### Leetcode 441. Arranging Coins and Binary Search Summary

#### Posted on Jun 26, 2017


class Solution(object):
def helper(self, x):
return (1 + x) * x / 2

def arrangeCoins(self, n):
# math solution
# return int((-1 + (1 + 8 * n) ** 0.5) / 2)
left = 1
right = n
while left + 1 < right:
mid = left + (right - left) / 2
tmp = self.helper(mid)
if tmp == n:
return mid
elif tmp > n:
right = mid
else:
left = mid

if self.helper(left) <= n:
return left
else:
return right


As one can see from my node: I am deviating from the two most commone binary search code writing styles. To find the exact match of increasing sequence: one can use one of the two codes below:


def binarySearch(nums, target):
left = 0
right = len(nums)
while left < right:
# good habit, despite python doesn't have overflow issue like c++ and java
mid = left + (right - left) / 2
if nums[mid] == target:
return mid
elif nums[mid] < target:
left = mid + 1
else:
right = mid
return -1

def binarySearch(nums, target):
left = 0
right = len(nums) - 1
while left <= right:
# good habit, despite python doesn't have overflow issue like c++ and java
mid = left + (right - left) / 2
if nums[mid] == target:
return mid
elif nums[mid] < target:
left = mid + 1
else:
right = mid - 1
print(left,right)
return -1


By moving 1 into while loop, one can actually write as below. Then, question comes: what has this exactly achieved?

The answer is that this way, left and right are all in bound of the array. If not putting 1 in the while loop, either left or right will cause the so-called "out of bound" error if one writes nums[left] or nums[right]. If you don't believe, you can check the printed left and right value by running the python codes.


def binarySearch(nums, target):
left = 0
right = len(nums) - 1
while left + 1 < right:
# good habit, despite python doesn't have overflow issue like c++ and java
mid = left + (right - left) / 2
if nums[mid] == target:
return mid
elif nums[mid] < target:
left = mid
else:
right = mid
print(left,right)
return -1


As one may wonder, why is this "in-bound" property useful in practice? It is not particularly useful for finding exact match but it is very very useful for finding lowerbound or upperbound of array. For $[1,6,6,6,8,10]$, I want to find the largest index of element out of those which are smaller than or equal to the target value. To achieve this, I can simply modify my code to be the following:


def binarySearch(nums, target):
left = 0
right = len(nums) - 1
while left + 1 < right:
# good habit, despite python doesn't have overflow issue like c++ and java
mid = left + (right - left) / 2
if nums[mid] >= target:
right = mid
else:
left = mid
print(left,right)
# to find the smallest index among all the equal numbers, like 6 in the example
if nums[right] <= target:
return right
else:
return left


If one wants to find the largest index of element out of all which are smaller than the target, we should tweak the codes a little bit: just need to change last 4 lines of code to be more precise


def binarySearch(nums, target):
left = 0
right = len(nums) - 1
while left + 1 < right:
# good habit, despite python doesn't have overflow issue like c++ and java
mid = left + (right - left) / 2
if nums[mid] >= target:
right = mid
else:
left = mid
print(left,right)
# to find the largest index of element out of all which are smaller than the target,
if nums[right] < target:
return right
else:
return left


#### Do you feel this way it is much easier to memorize Binary Search Codes?

Similarly, to find the smallest index of element out of those which are larger than or equal to the target value, we should do the following:


def binarySearch(nums, target):
left = 0
right = len(nums) - 1
while left + 1 < right:
# good habit, despite python doesn't have overflow issue like c++ and java
mid = left + (right - left) / 2
if nums[mid] > target:
right = mid
else:
left = mid
print(left,right)
if nums[left] >= target:
return left
else:
return right


### Leetcode 416. Partition Equal Subset Sum and Dynamic Search Summary

#### Posted on Jun 24, 2017

My first idea is to use dfs to search all possible subsets of sum of half of the toal, however, it does not pass the large cases of leetcode. I even tried to sort the input array but still no luck. See my code:


class Solution(object):
def __init__(self):
self.res = False

def dfs(self, total, target, nums, position):
if total == target:
self.res = True
if total > target or position >= len(nums):
return
for i in range(position, len(nums)):
self.dfs(total + nums[i], target, nums, i + 1)
return

def canPartition(self, nums):
if sum(nums) % 2 == 0:
target = sum(nums) / 2
else:
return False
self.dfs(0, target, nums, 0)
return self.res


Then I search solutions online and then realize that this question should be solved using DP (Dynamic Programming). Well, it is not surprising now when you think back, right? Just recall how many times that dp beats recursion in previous questions. But I haven't really coded in python for a long time and haven't done leetcode for a long time too. So it is really a good idea to summarize as I do now. I wish I can keep this habit just like what I did for running. So the dp code is below:


class Solution(object):
def canPartition(self, nums):
if sum(nums) % 2 == 0:
target = sum(nums) / 2
else:
return False
nums.sort()
total = sum(nums)
dp = [False for i in range(target + 1)]
dp[0] = True
for i in range(len(nums)):
for j in reversed(range(nums[i], target + 1)):
dp[j] = dp[j] or dp[j - nums[i]]
return dp[-1]


If you find the code above hard to understand, don't blame yourself. Because it is pretty hard to understand. It is actually a version after space optimization. Time complexity here is $O(n^2)$ and space is $O(n)$. The more straight forward one before optimization will be following and the recurrence equation is


bool canPartition(vector<int>& nums) {
int sum = accumulate(nums.begin(), nums.end(), 0);
if(sum % 2 != 0){
return false;
}
int target = sum / 2;
int n = nums.size();
vector<vector<bool>> dp(n + 1, vector<bool>(target + 1, false));
//now we have to initialize the dp array
for(int i = 1; i <= n; ++i){
dp[i][0] = true;
}
for(int j = 0; j <= target; ++j){
dp[0][j] = false;
}
dp[0][0] = true;
// recurrence equation is dp[i][j] = dp[i-1][j] || dp[i-1][j - nums[i-1]] if j >= nums[i - 1]
// dp[i][j] = dp[i-1][j] otherwise
for(int i = 1; i <= n; ++i){
for(int j = 0; j <= target; ++j){
if(j >= nums[i-1]){
dp[i][j] = dp[i-1][j] || dp[i-1][j - nums[i-1]];
}else{
dp[i][j] = dp[i-1][j];
}
}
}
return dp[n][target];
}


To summarize, the recursion approach is a top-down approach: Start solving the given problem by breaking it down. If you see that the problem has been solved already, then just return the saved answer. If it has not been solved, solve it and save the answer. This is usually easy to think of and very intuitive. This is also referred to as Memoization. While Dynamic Programming approach is a bottom-up approach: Analyze the problem and see the order in which the sub-problems are solved and start solving from the trivial subproblem, up towards the given problem. In this process, it is guaranteed that the subproblems are solved before solving the problem. Read more from the notes by professor at UIUC.

Another similar question to this will be Leetcode 322. Coin Change.


def coinChange(coins, amount):
dp = [0] + [-1] * amount
for x in range(amount):
if dp[x] >= 0:
for c in coins:
if x + c <= amount and (dp[x + c] < 0 or dp[x + c] > dp[x] + 1):
dp[x + c] = dp[x] + 1
return dp[amount]


### Leetcode 561. Array Partition I

#### Posted on Jun 24, 2017

The idea is greedy algorithm: one needs to make the pair difference as small as possible. Don't want to waste the large numbers in the end of array after sorting.


class Solution(object):
def arrayPairSum(self, nums):
nums.sort()
return sum(nums[::2])


### Leetcode 617. Merge Two Binary Trees

#### Posted on Jun 21, 2017

I have the feeling that most questions regarding trees require to use either recursion or stack (queue) to do it iteratively.

And most questions are based on four traversal of trees, namely level order, preorder, postorder, and inorder traversal (Or using similar ideas).

So let us review how we do level order first: the basic idea is that using the queue to store root node first and then each time enqueue its left child node and right child node of the node dequeuing the queue. This way, the nodes in the same level of the tree will always be stored in a block fashion in the queue.


class Solution(object):
def levelOrder(self, root):
if not root: return []
q = collections.deque()
q.append(root)
res = []
while q:
level = []
for i in range(len(q)):
p = q.popleft()
if p.left: q.append(p.left)
if p.right: q.append(p.right)
level.append(p.val)
res.append(level)
return res



To do preorder traversal, using queue does not work, because preoder has to finsh all the left subtree of root before going to the right subtree. So here, only using stack can achieve that.


class Solution(object):
def preorderTraversal(self, root):
if not root:
return []
res = []
stack = [root]
while stack:
p = stack.pop()
res.append(p.val)
if p.right:
stack.append(p.right)
if p.left:
stack.append(p.left)
return res



Or recursively:


class Solution(object):
def preorderTraversal(self, root):
if not root:
return []
left_subtree = self.preorderTraversal(root.left)
right_subtree = self.preorderTraversal(root.right)
return [root.val] + left_subtree + right_subtree



As one can see, when you use stack, one can easily transform into recrusion, which uses call stack while when you need to use queue, such as BFS not DFS then it is much more diffciult, or/and not optimal to transform into recursion. A lot of people have discussed on this: link For level traversal, one can see code: link, which is $O(n^2)$ when the tree is skewed.

To do inorder traversal, similar reasoning compels us to use stack. See code below:


class Solution(object):
def inorderTraversal(self, root):
if not root:
return []
res = []
stack = []
p = root
while stack or p:
while p:
stack.append(p)
p = p.left
node = stack.pop()
p = node.right
res.append(node.val)
return res


One can see that the main difference is the order of pushing nodes onto stack. Instead of the order "parent, right child, left child", one has to push in the order of "root, left child of root, left child of previous one, ...., left-most child in tree, its right-brother (since left-most child's parent already in tree and by this time it has been poped out)"

Postorder is probably the most difficult among these four, of course here I mean iterative writing. See code:


class Solution(object):
def postorderTraversal(self, root):
if not root:
return []
res = []
stack = []
p = root
prev = None
while stack or p:
while p:
stack.append(p)
p = p.left
node = stack[-1]
if node.right == None or prev == node.right:
stack.pop()
res.append(node.val)
prev = node
else:
p = node.right
return res



By the way, if you are like me, who totally forgot tree terminology, then you should check the difference between depth, level, and height. See link. Another example of how important tree is in applications, is that when I tried to parse nested JSON data at work today, I suddenly realize that one can do DFS (together with Python json library) to search for keywords in nestes JSON data format.

Okay, let us do the easy Leetcode Question 619 together:


# Definition for a binary tree node.
# class TreeNode(object):
#     def __init__(self, x):
#         self.val = x
#         self.left = None
#         self.right = None

class Solution(object):
def mergeTrees(self, t1, t2):
"""
:type t1: TreeNode
:type t2: TreeNode
:rtype: TreeNode
"""
if t1 == None and t2 == None:
return None
elif t1 == None and t2 != None:
return t2
elif t2 == None and t1 != None:
return t1
else:
left_subtree = self.mergeTrees(t1.left, t2.left)
right_subtree = self.mergeTrees(t1.right, t2.right)
total = t1.val + t2.val
root = TreeNode(total)
root.left = left_subtree
root.right = right_subtree
return root


I was thinking of doing it iteratively but then I realize it is much easier to do it recursively. To do the problem iteratively, one probably has to use two queues for BFS (or stacks for DFS) rather than one call stack. I found the code below from link


/**
* Definition for a binary tree node.
* public class TreeNode {
*     int val;
*     TreeNode left;
*     TreeNode right;
*     TreeNode(int x) { val = x; }
* }
*/
public class Solution {
public TreeNode mergeTrees(TreeNode t1, TreeNode t2) {
if (t1 == null) return t2;
if (t2 == null) return t1;
Deque deque1 = new ArrayDeque<>();
Deque deque2 = new ArrayDeque<>();
Deque deque3 = new ArrayDeque<>();
// BFS
deque1.offer(t1);
deque2.offer(t2);
TreeNode t3 = new TreeNode(0);
deque3.offer(t3);
// Using res as the resulting Binary Tree
while (!deque1.isEmpty() || !deque2.isEmpty()) {
TreeNode p1 = deque1.poll();
TreeNode p2 = deque2.poll();
TreeNode p3 = deque3.poll();
p3.val = p1.val + p2.val;

if (p1.left == null || p2.left == null) {
p3.left = (p1.left == null) ? p2.left : p1.left;
} else if (p1.left != null && p2.left != null) {
deque1.offer(p1.left);
deque2.offer(p2.left);
TreeNode p4 = new TreeNode(0);
p3.left = p4;
deque3.offer(p3.left);
}
if (p1.right == null || p2.right == null) {
p3.right = (p1.right == null) ? p2.right : p1.right;
} else if (p1.right != null  && p2.right != null) {
deque1.offer(p1.right);
deque2.offer(p2.right);
TreeNode p5 = new TreeNode(0);
p3.right = p5;
deque3.offer(p3.right);
}
}
return t3;
}
}


### Leetcode 458. Poor Pigs

#### Posted on Jun 18, 2017

Someone's idea is so brilliant. I suspect he studies math before. To do this question, one has to understand how many buckets (or bucket groups) one pig can check within the test time. If using one pig, that is test_time / die_time + 1, but when using two or more pigs, it shouldn't be number_pigs * (test_time / die_time + 1).....which is my naive idea and it is too small. Instead it should be (test_time / die_time + 1)^number_pigs. The reason is one can arrange all the buckets in R^n space. Each coordinate axis should be indexed from 0 to (test_time / die_time + 1) and in total there are number_pigs coordinates. This way each pig helps to reduce the space from R^n to R^(n-1) by determing a particular bucket group in that particular coordinate axis, since there is only one poison in buckets. Therefore, one should have n = log(number_buckets)/log(test_time / die_time + 1) and rounding up. One can code using while loop or using log from math library in Python or any other language.


def poorPigs(self, buckets, minutesToDie, minutesToTest):
pigs = 0
while (minutesToTest / minutesToDie + 1) ** pigs < buckets:
pigs += 1
return pigs


Template design by Andreas Viklund