Contents
- LD Sketch
- Seqhash
- What's New
- Reversible sketch
- Count-sketch and Count-min Sketch
- Diamond sketch: accurate per-flow measurement
- Finding top-k elements in data streams
- Appendix
  - Bloom Filter
  - Quotient Filter and Cascade filter
- Summary
LD Sketch
- Application: network data streams
  - Anomaly detection
  - Heavy hitter detection
  - Heavy changer detection
- Benefits: accuracy, scalability
- Characteristics:
  - Combines counter-based and sketch-based techniques
  - Parallel architecture (merging distributed streams)
  - Split into local detection and distributed detection
  - Enhanced by two heuristic methods
- [1]
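To make the two detection targets concrete, here is a minimal sketch in Python that uses exact per-key counters. It is not LD-Sketch itself (which replaces the exact counters with a bounded-memory structure); it only illustrates what "heavy hitter" and "heavy changer" mean. The function names, the (key, size) packet representation, and the thresholds are assumptions made for this illustration.

```python
from collections import Counter


def _totals(epoch):
    """Sum the sizes per key for one epoch of (key, size) records."""
    totals = Counter()
    for key, size in epoch:
        totals[key] += size
    return totals


def heavy_hitters(epoch, threshold):
    """Keys whose total size within a single epoch exceeds the threshold."""
    return {k for k, v in _totals(epoch).items() if v > threshold}


def heavy_changers(prev_epoch, curr_epoch, threshold):
    """Keys whose total size changed by more than the threshold between two epochs."""
    prev, curr = _totals(prev_epoch), _totals(curr_epoch)
    # A Counter returns 0 for a missing key, so keys seen in only one epoch work too.
    return {k for k in prev.keys() | curr.keys() if abs(curr[k] - prev[k]) > threshold}


epoch1 = [("10.0.0.1", 500), ("10.0.0.2", 50), ("10.0.0.1", 700)]
epoch2 = [("10.0.0.1", 100), ("10.0.0.3", 900)]
print(heavy_hitters(epoch1, 1000))
print(heavy_changers(epoch1, epoch2, 800))
```

Exact counting needs memory proportional to the number of distinct keys, which is the cost that sketch-based designs try to avoid.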
Seqhash
- Application:
  - Intrusion prevention
  - Large flow detection
  - Heavy hitter/changer recovery
- Advantages: fast and accurate, with small resource overhead (only slightly larger than the theoretical minimum)
- [2]
What's New
Finds absolute, relative, and variational differences between traffic streams.
- Uses a sketch to record traffic
- Advantages:
  - Fast
  - Small space overhead
- [3]
Reversible sketch
Sketches used for traffic change detection and anomaly detection do not store the flow key information (IP addresses, etc.), so it is difficult to recover which flows are abnormal. A reversible sketch allows that key information to be inferred back from the sketch.
- Characteristics:
  - Records packet information with a small memory overhead
  - Identifies the flows that changed (anomalies) and recovers the key information for those flows
- [6]
Count-sketch and Count-min Sketch
- The two have similar performance
- Application: statistics over high-speed streams
- Advantages:
  - Small space overhead
  - Fast
- [7]
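As a rough illustration of how a Count-Min sketch trades accuracy for space, the following minimal Python sketch keeps a small table of counters and hashes each key into one counter per row. The class name, the default width and depth, and the use of salted BLAKE2b as the hash family are assumptions made for this example, not details taken from [7].

```python
import hashlib


class CountMinSketch:
    """Minimal Count-Min sketch: depth rows of width counters each."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash function per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            key.encode("utf-8"), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        # Increment one counter in every row.
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum is the best guess.
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


cms = CountMinSketch(width=64, depth=4)
for _ in range(5):
    cms.add("10.0.0.1")
cms.add("10.0.0.2", 2)
print(cms.estimate("10.0.0.1"), cms.estimate("10.0.0.2"))
```

With non-negative updates the estimate never falls below the true count; it can only overestimate. A Count sketch differs mainly in that each row also applies a random sign to the update and the query takes the median across rows instead of the minimum.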
Diamond sketch: accurate per-flow measurement
For real IP streams
- For skewed IP traffic, a conventional sketch uses its measurement space inefficiently; Diamond sketch dynamically assigns sketch space to each flow.
- Advantages: improves measurement accuracy while keeping a reasonable speed.
- [8]
Finding top-k elements in data streams
- Application: detecting the most frequent elements in a data stream
- Advantages:
  - Small space overhead
  - Fast
- [9]
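One way to see how top-k detection can run in small, fixed space is the well-known Space-Saving algorithm, sketched below in Python. It is used here only as an illustration and is not claimed to be the exact method of [9]; the function names and the capacity parameter are assumptions made for this example.

```python
def space_saving(stream, capacity):
    """Track at most `capacity` counters while scanning the stream once."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < capacity:
            counters[item] = 1
        else:
            # Evict the item with the smallest count; the newcomer inherits
            # that count plus one, so counts may overestimate but never underestimate.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters


def top_k(stream, k, capacity):
    """Return the k monitored items with the largest (possibly overestimated) counts."""
    counters = space_saving(stream, capacity)
    return sorted(counters.items(), key=lambda kv: kv[1], reverse=True)[:k]


stream = ["a"] * 50 + ["b"] * 30 + ["c", "d", "e"]
print(top_k(stream, k=2, capacity=3))
```

The linear scan for the minimum keeps the example short; a practical implementation would keep the counters in a structure that finds the minimum in constant time.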
Appendix
Bloom Filter
- A Bloom filter (BF) is a space-efficient probabilistic data structure that uses a bit array to represent a set very compactly.
- History: the Bloom filter was introduced by Bloom in 1970.
- Application: testing whether an element is in a set.
- Features: a Bloom filter may report false positives (an element reported as present that was never added), but never false negatives.
- Applicable scenarios: a Bloom filter is not suitable for "zero error" applications. Where a low error rate can be tolerated, it saves a great deal of space compared with other common approaches such as hash tables or binary search.
- Advantages: space efficiency and query time are far better than those of general-purpose algorithms.
- Disadvantages: there is a certain false positive rate, and deleting elements is difficult.
- For more details, see [10][11].
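The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation: the class name, the default sizes, and the choice of salted BLAKE2b for the hash functions are assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter backed by a bit array stored in a bytearray."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting BLAKE2b with the hash number.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode("utf-8"), digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True only if every position is set; other items may have set them,
        # which is where false positives come from.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


bf = BloomFilter()
bf.add("10.0.0.1")
print("10.0.0.1" in bf)
```

Because several items can share a bit, clearing bits to delete one item could remove others as well, which is why deletion is difficult in the basic structure.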
Quotient Filter and Cascade filter
- The Quotient Filter and Cascade Filter were designed by Bender et al.; both are probabilistic data structures with high space efficiency.
- Application: testing whether an element is in a set.
- Advantages: high throughput for insert, query, and delete operations, reported as two orders of magnitude higher than a Bloom filter.
- See [12][13] for more details.
Summary
- Sketch-based methods are mainly counting/statistical methods. They are often used for large-flow and anomalous-traffic detection, and the key information of packets can be recovered from the measurement results.
- Key benefits:
  - Space-saving
  - Fast
- Main disadvantages:
  - Results are approximate, not exact
  - Higher computational overhead
References:
[1] A hybrid local and distributed sketching design for accurate and scalable heavy key detection in network data streams
[2] Sequential hashing: a flexible approach for unveiling significant patterns in high speed networks
[3] What's new: finding significant differences in network data streams
[6] Reversible sketches: enabling monitoring and analysis over high-speed data streams
[7] An improved data stream summary: the Count-Min sketch and its applications
[8] Diamond sketch: accurate per-flow measurement
[9] Finding top-k elements in data streams
[10] https://www.cnblogs.com/zhxshseu/p/5289871.html
[11] https://en.wikipedia.org/wiki/Bloom_filter
[12] https://en.wikipedia.org/wiki/Quotient_filter
[13] Don't thrash: how to cache your hash on flash
Survey of sketch-based methods in network measurement