👉 Contents
1 What professional competence means for a backend engineer
2 How to improve your ability to abstract and solve problems
Over the past two years I have worked with quite a few developers who had just joined Tencent, and I noticed some problems common to most of them. Work is busy, and a really good opportunity to share my experience and views with them systematically rarely comes up. Recently I happened to work with some teams transitioning from client-side to backend development, which prompted me to finally write this article and share a few thoughts on growing as a backend developer. I hope it helps.
01
// A ServerCodec implements reading of RPC requests and writing of
// RPC responses for the server side of an RPC session.
// The server calls ReadRequestHeader and ReadRequestBody in pairs
// to read requests from the connection, and it calls WriteResponse to
// write a response back. The server calls Close when finished with the
// connection. ReadRequestBody may be called with a nil
// argument to force the body of the request to be read and discarded.
// See NewClient's comment for information about concurrent access.
type ServerCodec interface {
	ReadRequestHeader(*Request) error
	ReadRequestBody(interface{}) error
	WriteResponse(*Response, interface{}) error

	// Close can be called multiple times and must be idempotent.
	Close() error
}
This code is taken from server.go in the Go standard library's net/rpc package; it is the standard library's abstraction of a generic RPC server's ability to receive requests and write responses. Once a request is received, handler functions no longer need to care about the details of reading and writing packets (though under this abstraction they still can, if they want to).
The concrete implementation of the ServerCodec abstraction is supplied by the user, who can choose any stream-like input/output: TCP over IPv4/IPv6, domain sockets, even pipes such as standard input/output. It can carry arbitrary protocols, e.g. an oidb msg head + pb msg body, or a pdu msg head + wup msg body. You can even make reading the request stream and writing the response stream zero-copy in memory. The abstraction is simple, its purpose obvious at a glance, yet flexible, and it introduces no unnecessary performance cost. For comparison, look at how spp abstracts similar functionality:
int spp_handle_input(unsigned flow, void* arg1, void* arg2)
It looks simple, yet leaves you guessing how to use it, and it loses a lot of capability: domain socket support has to ship as a feature in a framework upgrade, and pipes and other data streams cannot be used as input at all. For stream-like input, the framework inevitably has to stitch multiple received fragments into contiguous memory before calling this function so that the business code can judge whether a complete business packet has arrived. The best it can do is a ring buffer to reduce memory copies, but since the framework both manages the reads and has no idea how large a business packet may be, sooner or later a packet will not fit, and growing the ring buffer forces a memory copy anyway.
Here is another code example. Consider this abstraction:
// Reader is the interface that wraps the basic Read method.
//
// Read reads up to len(p) bytes into p. It returns the number of bytes
// read (0 <= n <= len(p)) and any error encountered. Even if Read
// returns n < len(p), it may use all of p as scratch space during the call.
// If some data is available but not len(p) bytes, Read conventionally
// returns what is available instead of waiting for more.
//
// When Read encounters an error or end-of-file condition after
// successfully reading n > 0 bytes, it returns the number of
// bytes read. It may return the (non-nil) error from the same call
// or return the error (and n == 0) from a subsequent call.
// An instance of this general case is that a Reader returning
// a non-zero number of bytes at the end of the input stream may
// return either err == EOF or err == nil. The next Read should
// return 0, EOF.
//
// Callers should always process the n > 0 bytes returned before
// considering the error err. Doing so correctly handles I/O errors
// that happen after reading some bytes and also both of the
// allowed EOF behaviors.
//
// Implementations of Read are discouraged from returning a
// zero byte count with a nil error, except when len(p) == 0.
// Callers should treat a return of 0 and nil as indicating that
// nothing happened; in particular it does not indicate EOF.
//
// Implementations must not retain p.
type Reader interface {
	Read(p []byte) (n int, err error)
}
// Writer is the interface that wraps the basic Write method.
//
// Write writes len(p) bytes from p to the underlying data stream.
// It returns the number of bytes written from p (0 <= n <= len(p))
// and any error encountered that caused the write to stop early.
// Write must return a non-nil error if it returns n < len(p).
// Write must not modify the slice data, even temporarily.
//
// Implementations must not retain p.
type Writer interface {
	Write(p []byte) (n int, err error)
}
// Closer is the interface that wraps the basic Close method.
//
// The behavior of Close after the first call is undefined.
// Specific implementations may document their own behavior.
type Closer interface {
	Close() error
}
// Seeker is the interface that wraps the basic Seek method.
//
// Seek sets the offset for the next Read or Write to offset,
// interpreted according to whence:
// SeekStart means relative to the start of the file,
// SeekCurrent means relative to the current offset, and
// SeekEnd means relative to the end.
// Seek returns the new offset relative to the start of the
// file and an error, if any.
//
// Seeking to an offset before the start of the file is an error.
// Seeking to any positive offset is legal, but the behavior of subsequent
// I/O operations on the underlying object is implementation-dependent.
type Seeker interface {
	Seek(offset int64, whence int) (int64, error)
}
// ReadWriter is the interface that groups the basic Read and Write methods.
type ReadWriter interface {
	Reader
	Writer
}

// ReadCloser is the interface that groups the basic Read and Close methods.
type ReadCloser interface {
	Reader
	Closer
}

// WriteCloser is the interface that groups the basic Write and Close methods.
type WriteCloser interface {
	Writer
	Closer
}

// ReadWriteCloser is the interface that groups the basic Read, Write and Close methods.
type ReadWriteCloser interface {
	Reader
	Writer
	Closer
}

// ReadSeeker is the interface that groups the basic Read and Seek methods.
type ReadSeeker interface {
	Reader
	Seeker
}

// WriteSeeker is the interface that groups the basic Write and Seek methods.
type WriteSeeker interface {
	Writer
	Seeker
}

// ReadWriteSeeker is the interface that groups the basic Read, Write and Seek methods.
type ReadWriteSeeker interface {
	Reader
	Writer
	Seeker
}
Reading and writing need buffering: merging many small reads and writes reduces the number of I/O calls. func NewWriter(w io.Writer) *bufio.Writer turns one io.Writer into another io.Writer (a *bufio.Writer has more capabilities than a plain io.Writer, yet can still be used and passed around as one). You gain capability without the mental overhead of the C++ standard library's zoo of stream type definitions. Even more typical is func LimitReader(r Reader, n int64) Reader: it turns an io.Reader into another io.Reader that returns io.EOF after at most n bytes, adding that one capability and no other mental burden.
These examples are from Go; in Rust or Scala you can achieve a similar effect with traits (though there you must impl or extend explicitly, and dependency management gets more complicated). Stacking capabilities onto reads and writes through this read/write abstraction, and deriving zero-mental-overhead utilities like bufio and io.LimitReader from it, is what a really good abstraction looks like. Next, let's look at a LeetCode problem:
/*
* @lc app=leetcode id=212 lang=golang
*
* [212] Word Search II
*
* https://leetcode.com/problems/word-search-ii/description/
*
* algorithms
* Hard (27.69%)
* Total Accepted: 100.6K
* Total Submissions: 363.4K
* Testcase Example: '[["o","a","a","n"],["e","t","a","e"],["i","h","k","r"],["i","f","l","v"]]\n["oath","pea","eat","rain"]'
*
* Given a 2D board and a list of words from the dictionary, find all words in
* the board.
*
* Each word must be constructed from letters of sequentially adjacent cell,
* where "adjacent" cells are those horizontally or vertically neighboring. The
* same letter cell may not be used more than once in a word.
*
* Example:
*
*
* Input:
* words = ["oath","pea","eat","rain"] and board =
* [
* ['o','a','a','n'],
* ['e','t','a','e'],
* ['i','h','k','r'],
* ['i','f','l','v']
* ]
*
* Output: ["eat","oath"]
*
*
* Note:
* You may assume that all inputs are consist of lowercase letters a-z.
*/
public List<String> findWords(char[][] board, String[] words) {}
Take a moment to think about this problem. At first glance it looks messy, but if you have strong abstraction skills and a good library of algorithmic knowledge, you can quickly abstract it into a DFS (depth-first search) + Trie (prefix tree) problem and arrive at a highly efficient solution (15 ms):
public List<String> findWords(char[][] board, String[] words) {
    List<String> res = new ArrayList<>();
    TrieNode root = buildTrie(words);
    for (int i = 0; i < board.length; i++) {
        for (int j = 0; j < board[0].length; j++) {
            dfs(board, i, j, root, res);
        }
    }
    return res;
}

public void dfs(char[][] board, int i, int j, TrieNode p, List<String> res) {
    char c = board[i][j];
    if (c == '#' || p.next[c - 'a'] == null) return;
    p = p.next[c - 'a'];
    if (p.word != null) { // found one
        res.add(p.word);
        p.word = null; // de-duplicate
    }
    board[i][j] = '#';
    if (i > 0) dfs(board, i - 1, j, p, res);
    if (j > 0) dfs(board, i, j - 1, p, res);
    if (i < board.length - 1) dfs(board, i + 1, j, p, res);
    if (j < board[0].length - 1) dfs(board, i, j + 1, p, res);
    board[i][j] = c;
}

public TrieNode buildTrie(String[] words) {
    TrieNode root = new TrieNode();
    for (String w : words) {
        TrieNode p = root;
        for (char c : w.toCharArray()) {
            int i = c - 'a';
            if (p.next[i] == null) p.next[i] = new TrieNode();
            p = p.next[i];
        }
        p.word = w;
    }
    return root;
}

class TrieNode {
    TrieNode[] next = new TrieNode[26];
    String word;
}
Below, by contrast, is a solution that abstracts the problem poorly and is full of redundant detail (embarrassingly, this was my own first attempt). It runs in 400 ms, ten times slower than the optimal solution above on LeetCode's test cases, and it is also harder to read:
func findWords(board [][]byte, words []string) []string {
	listBoard := constructListBoard(board)
	bMap := constructDict(listBoard)
	resultTmpMap := make(map[string]uint8)
	var found bool
	for i := range words {
		_, found = resultTmpMap[words[i]]
		if found {
			continue
		}
		if search([]byte(words[i]), bMap) {
			resultTmpMap[words[i]] = 0
		}
	}
	result := make([]string, 0, len(resultTmpMap))
	for k := range resultTmpMap {
		result = append(result, k)
	}
	return result
}

func constructListBoard(board [][]byte) (listBoard [][]*fourWayNode) {
	lineCount := len(board)
	listBoard = make([][]*fourWayNode, lineCount)
	colCount := len(board[0])
	for i := range listBoard {
		listBoard[i] = make([]*fourWayNode, colCount)
		for j := range listBoard[i] {
			listBoard[i][j] = &fourWayNode{
				b:    board[i][j],
				line: i,
				col:  j,
			}
		}
	}
	for i := range board {
		for j := range board[i] {
			// above
			if i-1 >= 0 {
				listBoard[i][j].above = listBoard[i-1][j]
			}
			// right
			if j+1 < colCount {
				listBoard[i][j].right = listBoard[i][j+1]
			}
			// below
			if i+1 < lineCount {
				listBoard[i][j].below = listBoard[i+1][j]
			}
			// left
			if j-1 >= 0 {
				listBoard[i][j].left = listBoard[i][j-1]
			}
		}
	}
	return
}

func constructDict(listBoard [][]*fourWayNode) (result map[string][]*searchPath) {
	result = make(map[string][]*searchPath)
	var tmp []*searchPath
	for i := range listBoard {
		for j := range listBoard[i] {
			tmp = result[string(listBoard[i][j].b)]
			result[string(listBoard[i][j].b)] = append(tmp, createPath(nil, listBoard[i][j], fmt.Sprintf("%d_%d", i, j)))
		}
	}
	return
}

func search(word []byte, bMap map[string][]*searchPath) bool {
	found := true
	var tmp, startVec, nextStartVec []*searchPath
	var tmpSlice string
	for k := range word {
		tmpSlice = string(word[:k+1])
		if found {
			tmp, found = bMap[tmpSlice]
			if found {
				if k == len(word)-1 {
					return true
				}
				startVec = tmp
				continue
			}
		}
		for j := range startVec {
			nextStartVec = append(nextStartVec, anyMatch(startVec[j], word[k])...)
		}
		if len(nextStartVec) > 0 {
			bMap[tmpSlice] = nextStartVec
			if k == len(word)-1 {
				return true
			}
		}
		if len(nextStartVec) == 0 {
			break
		}
		startVec = nextStartVec
		nextStartVec = nil
	}
	return false
}

type searchPath struct {
	goThrough map[string]uint8
	current   *fourWayNode
}

func createPath(lastStep *searchPath, newStep *fourWayNode, newKey string) (newPath *searchPath) {
	var outPath searchPath
	outPath.goThrough = make(map[string]uint8)
	if lastStep != nil {
		for k := range lastStep.goThrough {
			outPath.goThrough[k] = 0
		}
	}
	outPath.goThrough[newKey] = 0
	outPath.current = newStep
	return &outPath
}

func anyMatch(path *searchPath, b byte) (out []*searchPath) {
	var tmpKey string
	var found bool
	if path.current.above != nil && path.current.above.b == b {
		tmpKey = fmt.Sprintf("%d_%d", path.current.above.line, path.current.above.col)
		_, found = path.goThrough[tmpKey]
		if !found {
			out = append(out, createPath(path, path.current.above, tmpKey))
		}
	}
	if path.current.right != nil && path.current.right.b == b {
		tmpKey = fmt.Sprintf("%d_%d", path.current.right.line, path.current.right.col)
		_, found = path.goThrough[tmpKey]
		if !found {
			out = append(out, createPath(path, path.current.right, tmpKey))
		}
	}
	if path.current.below != nil && path.current.below.b == b {
		tmpKey = fmt.Sprintf("%d_%d", path.current.below.line, path.current.below.col)
		_, found = path.goThrough[tmpKey]
		if !found {
			out = append(out, createPath(path, path.current.below, tmpKey))
		}
	}
	if path.current.left != nil && path.current.left.b == b {
		tmpKey = fmt.Sprintf("%d_%d", path.current.left.line, path.current.left.col)
		_, found = path.goThrough[tmpKey]
		if !found {
			out = append(out, createPath(path, path.current.left, tmpKey))
		}
	}
	return
}

type fourWayNode struct {
	b     byte
	line  int
	col   int
	above *fourWayNode
	right *fourWayNode
	below *fourWayNode
	left  *fourWayNode
}
Another example: suppose we need to build (or use) a backend store that quickly answers "which of my friends also play Honor of Kings / PUBG Mobile / Naruto". How? Abstract the business requirement and it becomes: build one big hash table / red-black tree / B-tree holding every player, then look up each of my friends in that big structure. For speed, keep the structure in memory, and from there work through persistence, cold backup, sharding data across machines, hot standby via multi-machine replication, and so on.
Yet another example: with ownership-holding types such as Box&lt;T&gt; (heap allocation), Rc&lt;T&gt; (reference counting), RefCell&lt;T&gt; (interior mutability), and Weak&lt;T&gt; (a non-owning reference), Rust builds a fine-grained object ownership and lifetime system that the compiler can verify, so that objects are freed promptly and the compiler catches more unsafe code (memory leaks, null-pointer crashes) for us.
Every one of these examples analyzes and abstracts a problem precisely, and at the right level, for its scenario. Only then can we bring the resources of a backend machine cluster to bear and solve the problem efficiently, with high development velocity, high service throughput, and high resource utilization.
Put simply: you write code fast, and your per-machine requests per second are high.
02
If you are self-disciplined and curious, you should have started learning Go and Rust early. If you are more passive, you should at least invest time in the already well-established Go, a tool designed specifically for backend engineering.
Read good code, and see how others achieve excellence. The most accessible, well-commented, easy-to-follow code is the source of the Go standard library itself. Pick whatever interests you: the implementation of the http package, the json package, and so on.
Next, when the opportunity arises, study how real business systems are designed and implemented: how the recommendation systems behind Kandian and WeChat Channels work, how Shendun works, how ckafka works, how ckv+ works, how etcd and other configuration services with strong fault tolerance work, and so on.
"Learning without thinking leads nowhere." Merely studying, merely reading, is certainly not enough. If you learn without thinking and still feel inwardly satisfied, you are deceiving yourself, like people who listen to Luo Zhenyu summarize a book and feel as if they had carefully read it themselves. What is missing is the process of critical reflection, of turning things over and over in your mind.
I am fond of the line "what comes from paper always feels shallow; to truly know a thing you must practice it yourself." This is especially true of technology. If you have never managed a service cluster, it is hard to appreciate intuitively how much Docker simplifies cluster management. If you have always written Java and someone tells you Go is more productive, you will not really understand how Go raises development efficiency without giving up performance until you have properly built a few services with it. In the course of this practice you will, to some degree, inevitably think: about the subtle gaps between technologies, designs, algorithms, and code. Seeing the difference in final outcomes will in turn push you to analyze the gap and understand both sides of it better.
At the same time, there is an interesting phenomenon: doing the same work, different people think through different things and accumulate different insights. There must be another layer of methodology here, and everyone has to work it out for themselves. My personal advice: whenever you think a problem through, always try to go one level higher, to summarize one level up, to abstract one level up. Do this year after year and you may reap some unexpected rewards: you start to tell when your own summaries are good and when they are not. Think dialectically; when you summarize, try to attack the accuracy and correctness of your own conclusion. Keep accumulating. Most of us were top students in some sense once, our thinking ability is not the problem, and the gains will surely come.
As experience broadens and reflection accumulates, you begin to tell which of several abstractions of the same problem has better "taste", and you consistently choose the more tasteful one, continually refining your abstractions and your practice. The day you look back at code you once wrote with full confidence, code you thought was rather good, and see what is wrong with it, you have taken another step forward!