ARTS Week 2

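  1. I solved a classic two-pointer problem: Minimum Size Subarray Sum. The sliding-window technique fits this problem well, because every element is positive, so growing the window only increases the sum and shrinking it only decreases the sum.

LeetCode 209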

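    class Solution {
        public int minSubArrayLen(int s, int[] nums) {
            if (nums.length == 0) {
                return 0;
            }
            int left = 0;
            int right = 0;
            int length = nums.length;
            int result = length + 1; // sentinel: longer than any valid window
            int sum = 0;             // sum of nums[left..right)

            while (right < length) {
                // Grow the window until its sum reaches s or the array ends.
                while (sum < s && right < length) {
                    sum += nums[right++];
                }
                // Shrink from the left while the window still satisfies sum >= s.
                while (sum >= s) {
                    result = Math.min(result, right - left);
                    sum -= nums[left++];
                }
            }

            // The sentinel is unchanged when no window reaches s.
            return result == length + 1 ? 0 : result;
        }
    }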
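One way to gain confidence in a sliding-window solution is to compare it against a brute-force version on small random inputs. The sketch below is only an illustration: the class name `MinSubArrayLenCheck` and its helper methods are hypothetical, and the sliding window is re-implemented inside it so the block compiles on its own.

```java
import java.util.Random;

// Hypothetical helper class for illustration; not part of the LeetCode solution.
class MinSubArrayLenCheck {

    // Sliding window, O(n). Assumes target >= 1 and all elements are positive.
    static int slidingWindow(int target, int[] nums) {
        int best = Integer.MAX_VALUE;
        int sum = 0;
        int left = 0;
        for (int right = 0; right < nums.length; right++) {
            sum += nums[right];
            // Shrink from the left while the window still reaches the target.
            while (sum >= target) {
                best = Math.min(best, right - left + 1);
                sum -= nums[left++];
            }
        }
        return best == Integer.MAX_VALUE ? 0 : best;
    }

    // Brute force, O(n^2): try every start index and stop at the first end that reaches the target.
    static int bruteForce(int target, int[] nums) {
        int best = 0;
        for (int i = 0; i < nums.length; i++) {
            int sum = 0;
            for (int j = i; j < nums.length; j++) {
                sum += nums[j];
                if (sum >= target) {
                    int len = j - i + 1;
                    if (best == 0 || len < best) {
                        best = len;
                    }
                    break;
                }
            }
        }
        return best;
    }

    // Returns true if both versions agree on every randomly generated input.
    static boolean agreeOnRandomInputs(int trials, long seed) {
        Random rnd = new Random(seed);
        for (int t = 0; t < trials; t++) {
            int[] nums = new int[rnd.nextInt(8)];
            for (int i = 0; i < nums.length; i++) {
                nums[i] = 1 + rnd.nextInt(9); // positive values only
            }
            int target = 1 + rnd.nextInt(30);
            if (slidingWindow(target, nums) != bruteForce(target, nums)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(slidingWindow(7, new int[] {2, 3, 1, 2, 4, 3})); // prints 2
        System.out.println(agreeOnRandomInputs(1000, 42L));
    }
}
```

For `s = 7` and `[2, 3, 1, 2, 4, 3]`, the shortest qualifying window is `[4, 3]`, so the answer is 2.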

2. Review

First part of chapter 1 of Distributed Systems for Fun and Profit

This chapter introduces some basic concepts of distributed systems, such as performance and availability.

Distributed systems are constrained by two physical factors:

  • the number of nodes (which increases with the required storage and computation capacity)

  • the distance between nodes (information travels, at best, at the speed of light)

Working within those constraints:

  • an increase in the number of independent nodes increases the probability of failure in a system (reducing availability and increasing administrative costs)

  • an increase in the number of independent nodes may increase the need for communication between nodes (reducing performance as scale increases)

  • an increase in geographic distance increases the minimum latency for communication between distant nodes (reducing performance for certain operations)
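To make these constraints concrete, here is a rough back-of-the-envelope sketch. It rests on simplifying assumptions that are mine, not the book's: nodes fail independently with the same availability, and a signal travels in a straight line at the speed of light in a vacuum. The class and method names are hypothetical.

```java
// Hypothetical helper class; the inputs are illustrative, not measurements.
class DistributedConstraintsSketch {
    static final double SPEED_OF_LIGHT_KM_PER_S = 299_792.458;

    // Probability that at least one of `nodes` nodes is down,
    // ASSUMING independent failures and identical per-node availability.
    static double probabilityAnyNodeDown(int nodes, double perNodeAvailability) {
        return 1.0 - Math.pow(perNodeAvailability, nodes);
    }

    // Lower bound on one-way latency in milliseconds over a straight-line distance,
    // ASSUMING propagation at the speed of light in a vacuum.
    static double minOneWayLatencyMillis(double distanceKm) {
        return distanceKm / SPEED_OF_LIGHT_KM_PER_S * 1000.0;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 10, 100}) {
            System.out.printf("%d nodes: P(at least one down) = %.3f%n",
                    n, probabilityAnyNodeDown(n, 0.99));
        }
        System.out.printf("10,000 km: at least %.1f ms one way%n",
                minOneWayLatencyMillis(10_000));
    }
}
```

Under these assumptions, with 99% availability per node, the chance that at least one of 100 nodes is down is about 63%, and a signal needs at least about 33 ms to cover 10,000 km one way. Real networks are slower than this bound.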

At first glance, I thought more nodes would bring more benefits than problems. I assumed that because traffic can be split across different nodes, the computing bottleneck (requests costing a lot of CPU time) would be solved. I had simply ignored the failure and communication costs of a distributed system!

3. Tip

How do you fix a bug in an old system?

First, find all the classes and methods that are related to the bug.

Second, read the code and trace the call chain.

Third, add some logs and run the code (use mocks or write a main method).

Finally, fix the bug!
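The third step can be sketched roughly as follows. Everything here is hypothetical and invented for illustration: the class `LegacyDiscountDebugSketch`, its methods, and the bug itself. The idea is to add temporary log lines around the suspect computation and drive it from a small main method, without starting the whole system.

```java
import java.util.logging.Logger;

// Hypothetical legacy class, invented purely for illustration.
class LegacyDiscountDebugSketch {
    private static final Logger LOG =
            Logger.getLogger(LegacyDiscountDebugSketch.class.getName());

    // Suspect method with temporary log lines at entry, at the intermediate value, and at exit.
    static int applyDiscountBuggy(int priceCents, int percentOff) {
        LOG.info("applyDiscount in: priceCents=" + priceCents + ", percentOff=" + percentOff);
        int fraction = percentOff / 100; // integer division: 0 for any percentOff below 100
        LOG.info("fraction=" + fraction);
        int result = priceCents - priceCents * fraction;
        LOG.info("applyDiscount out: " + result);
        return result;
    }

    // Fix found via the logs: multiply before dividing so the percentage is not truncated to 0.
    static int applyDiscountFixed(int priceCents, int percentOff) {
        return priceCents - priceCents * percentOff / 100;
    }

    // Tiny harness: reproduce the bug with a fixed input.
    public static void main(String[] args) {
        System.out.println("buggy: " + applyDiscountBuggy(1000, 20));
        System.out.println("fixed: " + applyDiscountFixed(1000, 20));
    }
}
```

In this made-up case the log line `fraction=0` points straight at the integer division, which is the kind of clue the extra logging is meant to surface.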

4. Share:

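Translated notes on the Design section of the Kafka documentation

4.2 Persistence

Don't fear the filesystem

A properly designed disk structure can often be about as fast as the network.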

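1. High-throughput sequential file reads and writes are much faster than typical disk seeks. This access pattern is highly predictable, so the operating system optimizes it heavily. Modern operating systems use read-ahead and write-behind: they prefetch data in large blocks and group small logical writes into large physical writes.

2. Modern operating systems use main memory heavily for disk caching. Unless direct I/O is used, data that a process keeps in its own in-process cache is likely also held in the OS page cache.

3. Kafka runs on the JVM, where objects often take up twice as much memory as the data they hold, and garbage collection becomes harder as the in-heap data grows. Kafka therefore relies on the filesystem and treats the page cache as preferable to an in-memory cache. Automatic access to all free memory at least doubles the available cache, and storing a compact byte structure instead of individual objects likely doubles it again.

On a 32GB machine, this yields a cache of 28-30GB with no GC penalty. The cache also stays warm across service restarts, whereas an in-process cache would have to be rebuilt in memory (roughly 10 minutes for a 10GB cache) or else start completely cold. It also simplifies the code, because keeping the cache and the filesystem coherent is left entirely to the OS.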

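The final design: all data is written immediately to a persistent log on the filesystem, without necessarily flushing it to disk.

For more on this pagecache-centric design, see this Varnish article: https://varnish-cache.org/docs/trunk/phk/notes.html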
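The idea of writing everything to an append-only log and leaving flushing to the OS can be sketched in a few lines of Java. This is only an illustration of the principle, not Kafka's actual implementation: the class `AppendOnlyLogSketch`, the length-prefixed record format, and the `roundTrip` helper are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical append-only log; NOT Kafka's real log implementation.
class AppendOnlyLogSketch implements AutoCloseable {
    private final FileChannel channel;

    AppendOnlyLogSketch(Path file) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    // Appends one length-prefixed record (assumed format). The write goes to the
    // OS page cache; channel.force() is deliberately not called, so the OS
    // decides when the bytes are flushed to disk.
    void append(String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + payload.length);
        buf.putInt(payload.length);
        buf.put(payload);
        buf.flip();
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    // Reads every record back in order, front to back.
    static List<String> readAll(Path file) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(Files.readAllBytes(file));
        List<String> records = new ArrayList<>();
        while (buf.remaining() >= Integer.BYTES) {
            byte[] payload = new byte[buf.getInt()];
            buf.get(payload);
            records.add(new String(payload, StandardCharsets.UTF_8));
        }
        return records;
    }

    // Convenience helper for the demo: write to a temp file, read back, clean up.
    static List<String> roundTrip(String... messages) {
        try {
            Path file = Files.createTempFile("log-sketch", ".bin");
            try (AppendOnlyLogSketch log = new AppendOnlyLogSketch(file)) {
                for (String m : messages) {
                    log.append(m);
                }
            }
            List<String> records = readAll(file);
            Files.delete(file);
            return records;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("first", "second", "third"));
    }
}
```

Because there is no in-process cache here, a restarted process could reread the file and would be served from whatever the OS page cache still holds.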
