當 DNS 解析器遇到 Go fuzzer 【未翻譯】

來源:互聯網
上載者:User
這是一個建立於 的文章,其中的資訊可能已經有所發展或是發生改變。

Here at CloudFlare we are heavy users of the github.com/miekg/dns Go DNS library and we make sure to contribute to its development as much as possible. Therefore when Dmitry Vyukov published go-fuzz and started to uncover tens of bugs in the Go standard library, our task was clear.

Hot Fuzz

Fuzzing is the technique of testing software by continuously feeding it inputs that are automatically mutated. For C/C++, the wildly successful afl-fuzz tool by Michał Zalewski uses instrumented source coverage to judge which mutations pushed the program into new paths, eventually hitting many rarely-tested branches.

go-fuzz applies the same technique to Go programs, instrumenting the source by rewriting it (like godebug does). An interesting difference between afl-fuzz and go-fuzz is that the former normally operates on file inputs to unmodified programs, while the latter asks you to write a Go function and passes inputs to that. The former usually forks a new process for each input, the latter keeps calling the function without restarting often.

還沒有人翻譯此段落

我來翻譯

There is no strong technical reason for this difference (and indeed afl recently gained the ability to behave like go-fuzz), but it's likely due to the different ecosystems in which they operate: Go programs often expose well-documented, well-behaved APIs which enable the tester to write a good wrapper that doesn't contaminate state across calls. Also, Go programs are often easier to dive into and more predictable, thanks obviously to GC and memory management, but also to the general community repulsion towards unexpected global states and side effects. On the other hand many legacy C code bases are so intractable that the easy and stable file input interface is worth the performance tradeoff.

Back to our DNS library. RRDNS, our in-house DNS server, uses github.com/miekgs/dns for all its parsing needs, and it has proved to be up to the task. However, it's a bit fragile on the edge cases and has a track record of panicking on malformed packets. Thankfully, this is Go, not BIND C, and we can afford to recover() panics without worrying about ending up with insane memory states. Here's what we are doing

func ParseDNSPacketSafely(buf []byte, msg *old.Msg) (err error) {      defer func() {        panicked := recover()        if panicked != nil {            err = errors.New("ParseError")        }    }()    err = msg.Unpack(buf)    return}

We saw an opportunity to make the library more robust so we wrote this initial simple fuzzing function:

func Fuzz(rawMsg []byte) int {      msg := &dns.Msg{}    if unpackErr := msg.Unpack(rawMsg); unpackErr != nil {        return 0    }    if _, packErr = msg.Pack(); packErr != nil {        println("failed to pack back a message")        spew.Dump(msg)        panic(packErr)    }    return 1}

To create a corpus of initial inputs we took our stress and regression test suites and used github.com/miekg/pcap to write a file per packet.

package mainimport (      "crypto/rand"    "encoding/hex"    "log"    "os"    "strconv"    "github.com/miekg/pcap")func fatalIfErr(err error) {      if err != nil {        log.Fatal(err)    }}func main() {      handle, err := pcap.OpenOffline(os.Args[1])    fatalIfErr(err)    b := make([]byte, 4)    _, err = rand.Read(b)    fatalIfErr(err)    prefix := hex.EncodeToString(b)    i := 0    for pkt := handle.Next(); pkt != nil; pkt = handle.Next() {        pkt.Decode()        f, err := os.Create("p_" + prefix + "_" + strconv.Itoa(i))        fatalIfErr(err)        _, err = f.Write(pkt.Payload)        fatalIfErr(err)        fatalIfErr(f.Close())        i++    }}


(CC BY 2.0 image by JD Hancock)

還沒有人翻譯此段落

我來翻譯

We then compiled our Fuzz function with go-fuzz, and launched the fuzzer on a lab server. The first thing go-fuzz does is minimize the corpus by throwing away packets that trigger the same code paths, then it starts mutating the inputs and passing them to Fuzz() in a loop. The mutations that don't fail (return 1) and expand code coverage are kept and iterated over. When the program panics, a small report (input and output) is saved and the program restarted. If you want to learn more about go-fuzz watch the author's GopherCon talk or read the README.

Crashes, mostly "index out of bounds", started to surface. go-fuzz becomes pretty slow and ineffective when the program crashes often, so while the CPUs burned I started fixing the bugs.

In some cases I just decided to change some parser patterns, for example reslicing and using len() instead of keeping offsets. However these can be potentially disrupting changes—I'm far from perfect—so I adapted the Fuzz function to keep an eye on the differences between the old and new, fixed parser, and crash if the new parser started refusing good packets or changed its behavior:

func Fuzz(rawMsg []byte) int {      var (        msg, msgOld = &dns.Msg{}, &old.Msg{}        buf, bufOld = make([]byte, 100000), make([]byte, 100000)        res, resOld []byte        unpackErr, unpackErrOld error        packErr, packErrOld     error    )    unpackErr = msg.Unpack(rawMsg)    unpackErrOld = ParseDNSPacketSafely(rawMsg, msgOld)    if unpackErr != nil && unpackErrOld != nil {        return 0    }    if unpackErr != nil && unpackErr.Error() == "dns: out of order NSEC block" {        // 97b0a31 - rewrite NSEC bitmap [un]packing to account for out-of-order        return 0    }    if unpackErr != nil && unpackErr.Error() == "dns: bad rdlength" {        // 3157620 - unpackStructValue: drop rdlen, reslice msg instead        return 0    }    if unpackErr != nil && unpackErr.Error() == "dns: bad address family" {        // f37c7ea - Reject a bad EDNS0_SUBNET family on unpack (not only on pack)        return 0    }    if unpackErr != nil && unpackErr.Error() == "dns: bad netmask" {        // 6d5de0a - EDNS0_SUBNET: refactor netmask handling        return 0    }    if unpackErr != nil && unpackErrOld == nil {        println("new code fails to unpack valid packets")        panic(unpackErr)    }    res, packErr = msg.PackBuffer(buf)    if packErr != nil {        println("failed to pack back a message")        spew.Dump(msg)        panic(packErr)    }    if unpackErrOld == nil {        resOld, packErrOld = msgOld.PackBuffer(bufOld)        if packErrOld == nil && !bytes.Equal(res, resOld) {            println("new code changed behavior of valid packets:")            println()            println(hex.Dump(res))            println(hex.Dump(resOld))            os.Exit(1)        }    }    return 1}

I was pretty happy about the robustness gain, but since we used the ParseDNSPacketSafely wrapper in RRDNS I didn't expect to find security vulnerabilities. I was wrong!

還沒有人翻譯此段落

我來翻譯

DNS names are made of labels, usually shown separated by dots. In a space saving effort, labels can be replaced by pointers to other names, so that if we know we encoded example.com at offset 15, www.example.com can be packed as www. + PTR(15). What we found is a bug in handling of pointers to empty names: when encountering the end of a name (0x00), if no label were read, "." (the empty name) was returned as a special case. Problem is that this special case was unaware of pointers, and it would instruct the parser to resume reading from the end of the pointed-to empty name instead of the end of the original name.

For example if the parser encountered at offset 60 a pointer to offset 15, and msg[15] == 0x00, parsing would then resume from offset 16 instead of 61, causing a infinite loop. This is a potential Denial of Service vulnerability.

A) Parse up to position 60, where a DNS name is found| ... |  15  |  16  |  17  | ... |  58  |  59  |  60  |  61  || ... | 0x00 |      |      | ... |      |      | ->15 |      |------------------------------------------------->     B) Follow the pointer to position 15| ... |  15  |  16  |  17  | ... |  58  |  59  |  60  |  61  || ... | 0x00 |      |      | ... |      |      | ->15 |      |         ^                                        |         ------------------------------------------      C) Return a empty name ".", special case triggersD) Erroneously resume from position 16 instead of 61| ... |  15  |  16  |  17  | ... |  58  |  59  |  60  |  61  || ... | 0x00 |      |      | ... |      |      | ->15 |      |                 -------------------------------->   E) Rinse and repeat

We sent the fixes privately to the library maintainer while we patched our servers and we opened a PR once done. (Two bugs were independently found and fixed by Miek while we released our RRDNS updates, as it happens.)

還沒有人翻譯此段落

我來翻譯

Not just crashes and hangs

Thanks to its flexible fuzzing API, go-fuzz lends itself nicely not only to the mere search of crashing inputs, but can be used to explore all scenarios where edge cases are troublesome.

Useful applications range from checking output sanity by adding crashing assertions to your Fuzz() function, to comparing the two ends of a unpack-pack chain and even comparing the behavior of two different versions or implementations of the same functionality.

For example, while preparing our DNSSEC engine for launch, I faced a weird bug that would happen only on production or under stress tests: NSEC records that were supposed to only have a couple bits set in their types bitmap would sometimes look like this

deleg.filippo.io.  IN  NSEC    3600    \000.deleg.filippo.io. NS WKS HINFO TXT AAAA LOC SRV CERT SSHFP RRSIG NSEC TLSA HIP TYPE60 TYPE61 SPF

The catch was that our "pack and send" code pools []byte buffers to reduce GC and allocation churn, so buffers passed to dns.msg.PackBuffer(buf []byte) can be "dirty" from previous uses.

var bufpool = sync.Pool{      New: func() interface{} {        return make([]byte, 0, 2048)    },}[...]    data := bufpool.Get().([]byte)    defer bufpool.Put(data)    if data, err = r.Response.PackBuffer(data); err != nil {

However, buf not being an array of zeroes was not handled by some github.com/miekgs/dns packers, including the NSEC rdata one, that would just OR present bits, without clearing ones that are supposed to be absent.

case `dns:"nsec"`:      lastwindow := uint16(0)    length := uint16(0)    for j := 0; j < val.Field(i).Len(); j++ {        t := uint16((fv.Index(j).Uint()))        window := uint16(t / 256)        if lastwindow != window {            off += int(length) + 3        }        length = (t - window*256) / 8        bit := t - (window * 256) - (length * 8)        msg[off] = byte(window) // window #        msg[off+1] = byte(length + 1) // octets length        // Setting the bit value for the type in the right octet--->    msg[off+2+int(length)] |= byte(1 << (7 - bit))         lastwindow = window    }    off += 2 + int(length)    off++}

The fix was clear and easy: we benchmarked a few different ways to zero a buffer and updated the code like this

// zeroBuf is a big buffer of zero bytes, used to zero out the buffers passed// to PackBuffer.var zeroBuf = make([]byte, 65535)var bufpool = sync.Pool{      New: func() interface{} {        return make([]byte, 0, 2048)    },}[...]    data := bufpool.Get().([]byte)    defer bufpool.Put(data)    copy(data[0:cap(data)], zeroBuf)    if data, err = r.Response.PackBuffer(data); err != nil {

Note: a recent optimization turns zeroing range loops into memclr calls, so once 1.5 lands that will be much faster than copy().

還沒有人翻譯此段落

我來翻譯

But this was a boring fix! Wouldn't it be nicer if we could trust our library to work with any buffer we pass it? Luckily, this is exactly what coverage based fuzzing is good for: making sure all code paths behave in a certain way.

What I did then is write a Fuzz() function that would first parse a message, and then pack it to two different buffers: one filled with zeroes and one filled with 0xff. Any differences between the two results would signal cases where the underlying buffer is leaking into the output.

func Fuzz(rawMsg []byte) int {      var (        msg         = &dns.Msg{}        buf, bufOne = make([]byte, 100000), make([]byte, 100000)        res, resOne []byte        unpackErr, packErr error    )    if unpackErr = msg.Unpack(rawMsg); unpackErr != nil {        return 0    }    if res, packErr = msg.PackBuffer(buf); packErr != nil {        return 0    }    for i := range res {        bufOne[i] = 1    }    resOne, packErr = msg.PackBuffer(bufOne)    if packErr != nil {        println("Pack failed only with a filled buffer")        panic(packErr)    }    if !bytes.Equal(res, resOne) {        println("buffer bits leaked into the packed message")        println(hex.Dump(res))        println(hex.Dump(resOne))        os.Exit(1)    }    return 1}

I wish here, too, I could show a PR fixing all the bugs, but go-fuzz did its job even too well and we are still triaging and fixing what it finds.

Anyway, once the fixes are done and go-fuzz falls silent, we will be free to drop the buffer zeroing step without worry, with no need to audit the whole codebase!

Do you fancy fuzzing the libraries that serve 43 billion queries per day? We are hiring in London, San Francisco and Singapore!

還沒有人翻譯此段落

我來翻譯
本文中的所有譯文僅用於學習和交流目的,轉載請務必註明文章譯者、出處、和本文連結
我們的翻譯工作遵照 CC 協議,如果我們的工作有侵犯到您的權益,請及時聯絡我們
相關文章

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.