My Project Is Not Object Oriented • The Art of Managing, Modifying, and Refactoring Legacy Code

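Overview#

Object orientation is now widespread, but many languages and programming styles are not object oriented: rule-based languages, functional programming languages, constraint-based languages, and purely procedural languages (C, COBOL, FORTRAN, Pascal, BASIC).

Procedural languages are especially challenging in a legacy environment. The options for introducing unit tests into procedural code are quite limited; the most important strategies are:

Look for a pinch point
Use link seams to break dependencies
If the language has a macro preprocessor, use preprocessing seams

Because breaking dependencies in procedural code is so hard, the best strategy is often to get a large chunk of code under test first and then develop under the protection of those tests.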

An Easy Case#

Procedural code isn't always a problem. Here is a C function from the Linux kernel:

void set_writetime(struct buffer_head *buf, int flag)
{
    int newtime;

    if (buffer_dirty(buf)) {
        /* Move buffer to dirty list if jiffies is clear */
        newtime = jiffies + (flag ? bdf_prm.b_un.age_super :
            bdf_prm.b_un.age_buffer);
        if(!buf->b_flushtime || buf->b_flushtime > newtime)
            buf->b_flushtime = newtime;
    } else {
        buf->b_flushtime = 0;
    }
}

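To test this function, we can set the value of jiffies, create a buffer_head, pass it to the function, and then check its values after the call. This is a relatively easy case.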
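
The steps just described can be sketched as a small standalone test. The kernel headers are not available outside the kernel, so the definitions of buffer_head, jiffies, bdf_prm, and buffer_dirty below are simplified, hypothetical stand-ins (including their types and the age values 30 and 5); only set_writetime itself is taken from the text.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the kernel declarations. */
struct buffer_head {
    int dirty;        /* assumed flag consulted by buffer_dirty() */
    int b_flushtime;
};

int jiffies;          /* the kernel's tick counter, reduced to a plain int */

struct {
    struct { int age_buffer; int age_super; } b_un;
} bdf_prm = { .b_un = { .age_buffer = 30, .age_super = 5 } };

int buffer_dirty(struct buffer_head *buf) { return buf->dirty; }

/* Function under test, as shown in the text. */
void set_writetime(struct buffer_head *buf, int flag)
{
    int newtime;

    if (buffer_dirty(buf)) {
        newtime = jiffies + (flag ? bdf_prm.b_un.age_super :
            bdf_prm.b_un.age_buffer);
        if (!buf->b_flushtime || buf->b_flushtime > newtime)
            buf->b_flushtime = newtime;
    } else {
        buf->b_flushtime = 0;
    }
}

/* Dirty buffer with no flush time yet: expect jiffies + age_buffer. */
void test_dirty_buffer_gets_flushtime(void)
{
    struct buffer_head buf = { .dirty = 1, .b_flushtime = 0 };
    jiffies = 100;
    set_writetime(&buf, 0);
    assert(buf.b_flushtime == 130);
}

/* Clean buffer: the flush time is reset to zero. */
void test_clean_buffer_is_reset(void)
{
    struct buffer_head buf = { .dirty = 0, .b_flushtime = 42 };
    set_writetime(&buf, 0);
    assert(buf.b_flushtime == 0);
}
```

Each test sets the global state it needs, calls the function, and inspects the buffer afterwards; no fakes are required because the function calls nothing with side effects.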

Many functions are not so lucky, though: they call other functions, which in turn do I/O or call into a vendor library.

A Hard Case#

Here is a C function that needs to change. It calls ksr_notify, which has side effects:

#include "ksrlib.h"

int scan_packets(struct rnode_packet *packet, int flag)
{
    struct rnode_packet *current = packet;
    int scan_result, err = 0;

    while(current) {
        scan_result = loc_scan(current->body, flag);
        if(scan_result & INVALID_PORT) {
            ksr_notify(scan_result, current);
        }
        ...
        current = current->next;
    }
    return err;
}

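Approach 1: Link Seam#

Create a library containing fake functions that have the same names as the originals but do nothing. For testing, link against this fake library instead of the real one.

Drawback: a link seam substitutes at link time, and each executable can hold only one definition of a function. If different tests need different behavior from ksr_notify, the fake has to contain conditional logic.

Approach 2: Preprocessing Seam#

C has a macro preprocessor, which can be used to remove the dependency: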
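
As a rough illustration, a fake library for ksr_notify might look like the sketch below. It is a variation on the do-nothing fake: it records each call so a test can check that a notification happened. The file name, the sensing variables, and the void return type are assumptions, since the real ksrlib.h is not shown.

```c
/* fake_ksrlib.c: hypothetical test-only replacement for the real ksrlib. */
#include <assert.h>
#include <stddef.h>

struct rnode_packet;   /* opaque here; the fake never looks inside it */

/* Sensing variables that tests can inspect (names are assumptions). */
int fake_notify_calls = 0;
int fake_last_scan_result = 0;
struct rnode_packet *fake_last_packet = NULL;

/* Same name and parameters as the real function; return type assumed. */
void ksr_notify(int scan_result, struct rnode_packet *packet)
{
    fake_notify_calls++;
    fake_last_scan_result = scan_result;
    fake_last_packet = packet;
}

/* In a real project this test would live in its own file and would call
   scan_packets; here it calls the fake directly to show the sensing. */
void test_fake_records_call(void)
{
    fake_notify_calls = 0;
    ksr_notify(7, NULL);
    assert(fake_notify_calls == 1);
    assert(fake_last_scan_result == 7);
    assert(fake_last_packet == NULL);
}
```

The test executable would then be linked against this file instead of the vendor library, while the production build keeps linking the real one.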

#include "ksrlib.h"

#ifdef TESTING
#define ksr_notify(code,packet)
#endif

int scan_packets(struct rnode_packet *packet, int flag)
{
    // ... function body unchanged ...
}

#ifdef TESTING
#include <assert.h>
int main() {
    struct rnode_packet packet;
    packet.body = ...
    ...
    int err = scan_packets(&packet, DUP_SCAN);
    assert(err & INVALID_PORT);
    ...
    return 0;
}
#endif

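A Better Organization: File Inclusion#

Mixing tests and production code in the same file makes it hard to navigate. File inclusion lets us move the tests into a separate file:

Production file (with the test hooks):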

#include "ksrlib.h"
#include "scannertestdefs.h"

int scan_packets(struct rnode_packet *packet, int flag)
{
    // ... function body ...
}

#include "testscanner.tst"

Test file testscanner.tst:

#ifdef TESTING
#include <assert.h>
void test_port_invalid() {
    struct rnode_packet packet;
    // ...
    int err = scan_packets(&packet, DUP_SCAN);
    assert(err & INVALID_PORT);
}

void test_body_not_corrupt() {
    // ...
}
#endif

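Although the macro preprocessor is easy to abuse, it is very useful in the context of testing legacy code. As long as macros are used only to make the code work in the test environment, there is little reason to worry about affecting production code.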
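
A preprocessing seam can also be used for sensing rather than just removing a call. The sketch below is one possible way to do that, not necessarily what scannertestdefs.h contains: the macro records each notification in variables a test can inspect. The variable names, the INVALID_PORT value, and the report_if_invalid function are hypothetical and exist only to keep the example self-contained.

```c
#include <assert.h>

#ifndef TESTING
#define TESTING   /* normally supplied by the build, e.g. cc -DTESTING */
#endif

#ifdef TESTING
/* Sensing variables (hypothetical names). */
static int notify_count = 0;
static int last_notify_code = 0;

/* Replace the call with code that records it; the packet is ignored. */
#define ksr_notify(code, packet) \
    do { notify_count++; last_notify_code = (code); } while (0)
#endif

#define INVALID_PORT 0x01   /* value assumed for illustration */

/* Hypothetical stand-in for production code that calls ksr_notify. */
int report_if_invalid(int scan_result)
{
    if (scan_result & INVALID_PORT) {
        ksr_notify(scan_result, 0);
        return 1;
    }
    return 0;
}

void test_invalid_port_is_reported(void)
{
    notify_count = 0;
    assert(report_if_invalid(INVALID_PORT) == 1);
    assert(notify_count == 1);
    assert(last_notify_code == INVALID_PORT);
}
```

Wrapping the macro body in do { ... } while (0) keeps it safe to use as a single statement, for example inside an if without braces.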

Adding New Behavior#

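In procedural legacy code, prefer introducing a new function over adding code to an existing one. At the very least, the new function can be tested in isolation.

Using TDD#

Write new functions using test-driven development (TDD). Put the pure logic in one set of functions and keep them free of dependencies; use small wrapper functions to bind the logic to the dependencies.

Example: `send_command`#

The original approach, with logic and the dependency (mart_key_send) mixed together: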

void send_command(int id, char *name, char *command_string) {
    char *message, *header;
    if (id == KEY_TRUM) {
        message = ralloc(sizeof(int) + HEADER_LEN + ...
        ...
    } else {
        ...
    }
    sprintf(message, "%s%s%s", header, command_string, footer);
    mart_key_send(message);
    free(message);
}

A better approach: extract the logic that precedes mart_key_send into form_command:

char *form_command(int id, char *name, char *command_string)
{
    char *message, *header;
    if (id == KEY_TRUM) {
        message = ralloc(sizeof(int) + HEADER_LEN + ...
        ...
    } else {
        ...
    }
    sprintf(message, "%s%s%s", header, command_string, footer);
    return message;
}

void send_command(int id, char *name, char *command_string) {
    char *command = form_command(id, name, command_string);
    mart_key_send(command);
    free(command);
}

Now form_command can be tested on its own:

char *command = form_command(1, "Mike Ratledge", "56:78:cusp-:78");
assert(!strcmp("<-rsp-Mike Ratledge><56:78:cusp-:78><-rspr>", command));
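
Because the body of form_command is elided above, the following is a separate, self-contained sketch of the same split using hypothetical names (form_message, send_message, transport_send) and a made-up message format. It shows the shape of the technique: a dependency-free function that builds the message, a thin wrapper that sends and frees it, and a test that exercises only the pure part.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Pure logic with no external dependency. The "<name><command>" format
   is hypothetical. The caller owns the returned buffer. */
char *form_message(const char *name, const char *command)
{
    size_t len = strlen(name) + strlen(command) + 5; /* 4 brackets + NUL */
    char *message = malloc(len);
    if (message != NULL)
        snprintf(message, len, "<%s><%s>", name, command);
    return message;
}

/* Stand-in for the real dependency; it only counts the bytes it is given. */
static size_t bytes_sent = 0;
static void transport_send(const char *message)
{
    bytes_sent += strlen(message);
}

/* Thin wrapper that binds the logic to the dependency. */
void send_message(const char *name, const char *command)
{
    char *message = form_message(name, command);
    if (message == NULL)
        return;
    transport_send(message);
    free(message);
}

/* The test touches only the pure function and frees what it receives. */
void test_form_message(void)
{
    char *message = form_message("Mike", "56:78");
    assert(message != NULL);
    assert(strcmp(message, "<Mike><56:78>") == 0);
    free(message);
}
```

Note that the test frees the returned buffer itself, since ownership passes to whoever calls the forming function.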

Function Pointer Technique#

When a function has many external calls scattered through it, function pointers can be used to create a seam:

struct database
{
    void (*retrieve)(struct record_id id);
    void (*update)(struct record_id id, struct record_set *record);
    ...
};

In production code the function pointers point to the real database functions; in tests they point to fakes. This lets us make calls in a style that resembles object orientation:

extern struct database db;
db.update(loan->id, loan->record);
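
A minimal sketch of how a test might wire fakes into such a struct is shown below. The layouts of record_id, record_set, and loan, the save_loan function, and the fake_ names are all assumptions made for illustration; in a production build db would be initialized with the real database functions instead.

```c
#include <assert.h>

struct record_id { int value; };       /* hypothetical layout */
struct record_set { int balance; };    /* hypothetical layout */

struct database {
    void (*retrieve)(struct record_id id);
    void (*update)(struct record_id id, struct record_set *record);
};

/* Fakes that only record what they were asked to do. */
static int fake_update_calls = 0;
static int fake_last_updated_id = 0;

static void fake_retrieve(struct record_id id) { (void)id; }

static void fake_update(struct record_id id, struct record_set *record)
{
    (void)record;
    fake_update_calls++;
    fake_last_updated_id = id.value;
}

/* Test build: the seam points at the fakes. */
struct database db = { fake_retrieve, fake_update };

/* Hypothetical code under test that goes through the seam. */
struct loan { struct record_id id; struct record_set *record; };

void save_loan(struct loan *loan)
{
    db.update(loan->id, loan->record);
}

void test_save_loan_updates_database(void)
{
    struct record_set record = { 500 };
    struct loan loan = { { 42 }, &record };
    fake_update_calls = 0;
    save_loan(&loan);
    assert(fake_update_calls == 1);
    assert(fake_last_updated_id == 42);
}
```

Unlike a link seam, the pointers can be reassigned at run time, so different tests in the same executable can install different fakes.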

Taking Advantage of Object Orientation#

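Many procedural languages have grown OO extensions: Visual Basic, COBOL, and Fortran all have them, and most C compilers can also compile C++.

When the language supports object orientation, Encapsulate Global References can be used to get an object seam.

Example: Migrating from C to C++#

Wrap the ksr_notify function in a class: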

class ResultNotifier
{
public:
    virtual void ksr_notify(int scan_result,
                            struct rnode_packet *packet);
};

// The default implementation delegates to the original C function
extern "C" void ksr_notify(int scan_result,
                            struct rnode_packet *packet);

void ResultNotifier::ksr_notify(int scan_result,
                                struct rnode_packet *packet)
{
    ::ksr_notify(scan_result, packet);
}

Declare a global instance, then change the call sites:

extern ResultNotifier globalResultNotifier;

int scan_packets(struct rnode_packet *packet, int flag)
{
    // ...
    while(current) {
        scan_result = loc_scan(current->body, flag);
        if(scan_result & INVALID_PORT) {
            globalResultNotifier.ksr_notify(scan_result, current);
        }
        // ...
    }
    return err;
}

Next, use Encapsulate Global References to move scan_packets into a Scanner class, and use Parameterize Constructor to inject the ResultNotifier:

class Scanner
{
private:
    ResultNotifier& notifier;
public:
    Scanner();
    Scanner(ResultNotifier& notifier);
    int scan_packets(struct rnode_packet *packet, int flag);
};

Now the ResultNotifier can be replaced in tests.

It’s All Object Oriented#

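An interesting observation: all procedural programs are essentially object oriented; it's just that most of them contain only one object.

Imagine a program with 100 functions. Put all of the function declarations into one class: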

class program
{
public:
    ...
    int db_find(char *id, unsigned int mnemonic_id,
                struct db_rec **rec);
    ...
    void process_run(struct gfh_task **tasks, int task_count);
    ...
};

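This change does not alter the system's behavior: the old C system is essentially one big object. When we start using Encapsulate Global References to split out new objects, we are subdividing the system in a way that makes it easier to work with.

Procedural code offers fewer options than object-oriented code, but progress is still possible. If your procedural language has an OO successor, consider migrating in that direction. Object seams are far more useful for improving design than link seams and preprocessing seams.

Summary#

In a project that is not object oriented, the core strategies are:

Use link seams and preprocessing seams to break dependencies for testing
Introduce new functions rather than modifying old ones, so that new code stays testable
Use function pointers to simulate object-oriented seams in C
If the language supports it, migrate gradually to object orientation to get better seams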
