Testing • Clean Code: Functional Design

TDD Is Language-Independent

Nearly all of the unit tests in this book are written with speclj (pronounced "speckle"), a Clojure framework developed by Micah Martin and others that is similar in style to Ruby's RSpec.

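The author began practicing TDD more than 20 years ago and has used it with Java, C#, C, C++, Ruby, Python, Lua, Clojure, and other languages. His conclusion is simple: the discipline does not depend on the language.
The fact that Clojure is a functional language does not change the testing strategy at all. The TDD cycle is still "write the test a few seconds ahead, then write the code."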

What About the REPL?

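Many functional programmers say, "I try everything in the REPL, so I don't need TDD." The author often experiments in the REPL too, but he then captures what he learned as tests.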

Tests are like diamonds: they are forever. A REPL experiment is gone by the next day.
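One way to make a REPL finding permanent is to paste it into a test. The sketch below is an illustration only, not code from the book: it uses clojure.test (which ships with Clojure) instead of speclj so that it needs no extra dependency, and the behavior it records is just an example of something one might notice at the REPL.

```clojure
(require '[clojure.test :refer [deftest is]]
         '[clojure.string :as str])

;; At the REPL we tried (str/split "a,b,,c" #",") and noticed that
;; the empty field in the middle is kept. Recording the observation
;; as a test means it is still being checked tomorrow.
(deftest split-keeps-empty-middle-field
  (is (= ["a" "b" "" "c"] (str/split "a,b,,c" #","))))
```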

What About Mocks?

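A mock (more formally, a test double) is a tool that TDD uses to isolate a test from some large part of the system. It creates objects that stand in for those parts and substitutes them in by relying on the LSP (Liskov Substitution Principle).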

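Because the LSP is usually regarded as an OO principle, and mocks in OO rely on polymorphic interfaces, an urban legend has grown up that "functional languages do not support mocks."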

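But the earlier SOLID chapter already showed that the LSP holds in functional languages just as well and that polymorphic interfaces are easy to build, so writing mocks in a functional language is not difficult at all.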
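As a minimal sketch of that claim (not code from the book), a protocol can serve as the polymorphic interface, and `reify` can supply a test double that is substitutable for any real implementation. The names `Mailer`, `deliver`, `welcome`, and `recording-mailer` are hypothetical.

```clojure
;; Hypothetical polymorphic interface.
(defprotocol Mailer
  (deliver [this address body]))

;; Hypothetical code under test: it depends only on the protocol.
(defn welcome [mailer user]
  (deliver mailer (:email user) (str "Welcome, " (:name user))))

;; Test double: satisfies Mailer but only records what it was asked to send.
(defn recording-mailer [sent]
  (reify Mailer
    (deliver [_ address body]
      (swap! sent conj [address body])
      nil)))

(def sent (atom []))
(welcome (recording-mailer sent) {:name "Ann" :email "ann@example.com"})
```

After the call, `@sent` holds the single recorded delivery, so a test can assert on it without any real mail being sent.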

Example (taken from more-speech):

(it "adds an unrooted article id to a tab"
  (let [message-id 1
        messages {message-id {:tags []}}
        event-context (atom {:text-event-map messages})]
    (reset! ui-context {:event-context event-context})
    (with-redefs [swing-util/add-id-to-tab (stub :add-id-to-tab)
                  swing-util/relaunch     (stub :relaunch)]
      (add-article-to-tab 1 "tab" nil)
      (should-have-invoked :relaunch)
      (should-have-invoked :add-id-to-tab
                           {:with ["tab" :selected 1]}))))

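Key points:

with-redefs: temporarily replaces the specified functions with named stubs
stub: accepts any arguments and returns no value, but remembers how it was called (which, strictly speaking, makes it a spy)
should-have-invoked: checks that the stub was actually called and can optionally specify the expected arguments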
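The same three ideas can be approximated with nothing but clojure.core. This is a hand-rolled sketch for illustration, not speclj's implementation; `send-email`, `notify-user`, and `make-spy` are hypothetical names.

```clojure
;; Hypothetical collaborator that the test must not really call.
(defn send-email [address body]
  (throw (ex-info "would talk to a real mail server" {:address address})))

;; Hypothetical code under test: it calls the collaborator.
(defn notify-user [user]
  (send-email (:email user) (str "Hello, " (:name user))))

;; A hand-rolled spy: accepts any arguments, returns nil,
;; and records the argument list of every call in an atom.
(defn make-spy []
  (let [calls (atom [])]
    {:calls calls
     :f     (fn [& args] (swap! calls conj (vec args)) nil)}))

(def recorded-calls
  (let [{:keys [calls f]} (make-spy)]
    ;; with-redefs replaces the var's root value only while the body runs.
    (with-redefs [send-email f]
      (notify-user {:name "Ann" :email "ann@example.com"}))
    @calls))
```

Comparing `recorded-calls` with the expected argument lists plays the role that should-have-invoked plays in the speclj example.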

Property-Based Testing

Spend enough time around functional programmers and you will eventually hear about QuickCheck and property-based testing.

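Property-based testing is often used as an argument against TDD. The author does not intend to get drawn into that debate. Instead, he wants to show how powerfully property-based testing and TDD work together.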

It does two things:

Generates random inputs
Isolates defects with a powerful "shrinking" strategy
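Both behaviors can be seen with a deliberately false property. The sketch below assumes test.check 0.10.0 or later is on the classpath (for `gen/small-integer`). The property and the var names are made up for illustration, and the `:seed` is fixed only so that reruns are repeatable.

```clojure
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; Deliberately false property: "reversing a vector never changes it".
(def reverse-is-identity
  (prop/for-all [v (gen/vector gen/small-integer)]
    (= v (reverse v))))

;; quick-check feeds randomly generated vectors to the property.
(def result (tc/quick-check 100 reverse-is-identity :seed 42))

;; :fail holds the arguments of the first random input that failed;
;; [:shrunk :smallest] holds the reduced arguments found by shrinking.
(def first-failure (first (:fail result)))
(def smallest      (first (get-in result [:shrunk :smallest])))
```

`first-failure` is whatever random vector first broke the property, while `smallest` is a reduced vector that still breaks it.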

Reinforcing Prime Factors with Property-Based Testing

Back to prime factors. The version produced through TDD:

(defn factors-of [n]
  (loop [factors [] n n divisor 2]
    (if (> n 1)
      (cond
        (> divisor (Math/sqrt n)) (conj factors n)
        (= 0 (mod n divisor))     (recur (conj factors divisor)
                                         (quot n divisor)
                                         divisor)
        :else                     (recur factors n (inc divisor)))
      factors)))

Two important properties can be checked:

The product gives back the original number: (reduce * factors) equals n
Every factor is prime
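Before handing these properties to a generator, it can help to check them by hand for one input. In the sketch below, `factors-of` is copied unchanged from the listing above so that the block stands alone; `prime?` is a hypothetical naive helper introduced only for this spot check.

```clojure
;; Copied unchanged from the listing above.
(defn factors-of [n]
  (loop [factors [] n n divisor 2]
    (if (> n 1)
      (cond
        (> divisor (Math/sqrt n)) (conj factors n)
        (= 0 (mod n divisor))     (recur (conj factors divisor)
                                         (quot n divisor)
                                         divisor)
        :else                     (recur factors n (inc divisor)))
      factors)))

;; Hypothetical helper: naive trial division, fine for small numbers.
(defn prime? [n]
  (and (> n 1)
       (not-any? #(zero? (rem n %)) (range 2 n))))

(def factors (factors-of 360))                    ; [2 2 2 3 3 5]
(def product-matches? (= 360 (reduce * factors))) ; property 1 for n = 360
(def all-prime? (every? prime? factors))          ; property 2 for n = 360
```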

Property 1: The Product Gives Back the Original Number

(def gen-inputs (gen/large-integer* {:min 1 :max 1000000000}))

(describe "properties"
  (it "multiplies out properly"
    (should-be
      :result
      (tc/quick-check
        1000
        (prop/for-all
          [n gen-inputs]
          (let [factors (factors-of n)]
            (= n (reduce * factors))))))))

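This test uses test.check (a Clojure port of QuickCheck). It has quick-check run 1,000 times. Each run computes the factors of a random integer between 1 and 1,000,000,000 and multiplies them back together to compare the product with the original number.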
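The listing refers to the aliases tc, gen, and prop without showing where they come from. A namespace declaration along the following lines would supply them. The namespace name is hypothetical, and the aliases are assumed to follow the usual test.check conventions rather than taken from the book.

```clojure
;; Hypothetical namespace name; only the aliases matter here.
(ns prime-factors.properties-spec
  (:require [speclj.core :refer :all]                  ; describe, it, should-be
            [clojure.test.check :as tc]                ; tc/quick-check
            [clojure.test.check.generators :as gen]    ; gen/large-integer*
            [clojure.test.check.properties :as prop])) ; prop/for-all
```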

Property 2: Every Factor Is Prime

(defn is-prime? [n]
  (if (= 2 n)
    true
    (loop [candidates (range 2 (inc (Math/sqrt n)))]
      (if (empty? candidates)
        true
        (if (zero? (rem n (first candidates)))
          false
          (recur (rest candidates)))))))

(it "they are all prime"
  (should-be
    :result
    (tc/quick-check
      1000
      (prop/for-all
        [n gen-inputs]
        (let [factors (factors-of n)]
          (every? is-prime? factors))))))

When both properties hold together, that is almost the definition of prime factorization.

Property-Based Testing as a Diagnostic Tool

Back to the Video Store example. make-statement-data turns a rental-order into statement-data. We can write a spec for the input and use generators to produce random inputs for large-scale verification.

Defining a Spec for the Input

(s/def ::name string?)
(s/def ::customer (s/keys :req-un [::name]))
(s/def ::title string?)
(s/def ::type #{:regular :childrens :new-release})
(s/def ::movie (s/keys :req-un [::title ::type]))
(s/def ::days pos-int?)
(s/def ::rental (s/keys :req-un [::days ::movie]))
(s/def ::rentals (s/coll-of ::rental))
(s/def ::rental-order (s/keys :req-un [::customer ::rentals]))

Writing the Generators

(def gen-customer-name (gen/such-that not-empty gen/string-alphanumeric))
(def gen-customer (gen/fmap (fn [name] {:name name}) gen-customer-name))
(def gen-days (gen/elements (range 1 100)))
(def gen-movie-type (gen/elements [:regular :childrens :new-release]))

(def gen-movie
  (gen/fmap (fn [[title type]] {:title title :type type})
            (gen/tuple gen/string-alphanumeric gen-movie-type)))

(def gen-rental
  (gen/fmap (fn [[movie days]] {:movie movie :days days})
            (gen/tuple gen-movie gen-days)))

(def gen-rentals (gen/such-that not-empty (gen/vector gen-rental)))

(def gen-rental-order
  (gen/fmap (fn [[customer rentals]]
              {:customer customer :rentals rentals})
            (gen/tuple gen-customer gen-rentals)))

(def gen-policy (gen/elements
                  [(make-normal-policy)
                   (make-buy-two-get-one-free-policy)]))

The similarity between the generators and the spec is eerie, and that is exactly where they complement each other.

Verifying That the Generators Produce Valid Data

(it "generates valid rental orders"
  (should-be
    :result
    (tc/quick-check
      100
      (prop/for-all
        [rental-order gen-rental-order]
        (nil? (s/explain-data ::constructors/rental-order rental-order))))))

s/explain-data returns nil when the data conforms to the spec, so a passing test means the generator produces valid data.
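The nil-on-success behavior is easy to see in isolation. The sketch below uses a cut-down spec (only `:days`) rather than the full rental-order spec, and the var names `valid-report` and `invalid-report` are hypothetical.

```clojure
(require '[clojure.spec.alpha :as s])

;; A cut-down spec for illustration: a rental only needs positive :days.
(s/def ::days pos-int?)
(s/def ::rental (s/keys :req-un [::days]))

;; Conforming data: explain-data returns nil.
(def valid-report (s/explain-data ::rental {:days 3}))

;; Non-conforming data: explain-data returns a map describing the problems.
(def invalid-report (s/explain-data ::rental {:days 0}))
```

Because only the conforming case yields nil, wrapping the call in `nil?` turns it into a property that quick-check can evaluate.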

Stating a Property of statement-data

(it "statement data totals are consistent under all policies"
  (should-be
    :result
    (tc/quick-check
      100
      (prop/for-all
        [rental-order gen-rental-order
         policy gen-policy]
        (let [statement-data (make-statement-data policy rental-order)
              prices (map :price (:movies statement-data))
              owed (:owed statement-data)]
          (= owed (reduce + prices)))))))

This quick-check has a bug. Did you spot it?

Running it produces a failure message that contains a :shrunk section:

{:shrunk
 {:smallest
  [{:customer {:name "0"},
    :rentals [{:movie {:title "", :type :regular}, :days 1}
              {:movie {:title "", :type :regular}, :days 1}
              {:movie {:title "", :type :regular}, :days 1}]}
   {:type :video-store.buy-two-get-one-free-policy/buy-two-get-one-free}]
  ...}}

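The real magic is in :shrunk: once quick-check finds a failing case, it keeps searching for the smallest input that still reproduces the failure.
Look at :smallest: three movies and the buy-two-get-one-free policy. The cause is obvious: under that policy, the sum of the itemized prices for three movies does not equal :owed, because one of the movies is free.
This automatic shrinking is the key value of property-based testing as a diagnostic tool.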
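To make the idea of shrinking concrete, here is a toy shrinker in plain Clojure. It is an illustration under stated assumptions, not how test.check works internally: it only tries removing one element at a time, whereas test.check also shrinks the individual values. The pricing rule in `owed` and all function names are hypothetical stand-ins for the Video Store policy.

```clojure
;; Hypothetical buy-two-get-one-free rule: one item is free for every
;; three rented, and the cheapest items are the free ones.
(defn owed [prices]
  (let [free-count (quot (count prices) 3)]
    (- (reduce + prices)
       (reduce + (take free-count (sort prices))))))

;; The flawed property from the text: the itemized sum equals the amount owed.
(defn consistent? [prices]
  (= (reduce + prices) (owed prices)))

;; All vectors obtained by removing exactly one element.
(defn shrink-candidates [v]
  (for [i (range (count v))]
    (into (subvec v 0 i) (subvec v (inc i)))))

;; Keep moving to the first smaller input that still fails,
;; and stop when no smaller input fails.
(defn shrink [fails? input]
  (loop [current input]
    (if-let [smaller (first (filter fails? (shrink-candidates current)))]
      (recur smaller)
      current)))

(def smallest
  (shrink (complement consistent?) [3 1 4 1 5 9 2]))
```

Starting from seven prices, `smallest` ends up with three elements, which mirrors the three-movie case in the :shrunk output: with only two items nothing is free, so the property holds again.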

Why quick-check Is Less Common in OO

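Why are tools like quick-check less popular in OO languages? Perhaps because they work best with pure functions. Setting up generators and properties in a mutable system is possible in principle, but it is far more complicated than in an immutable one.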