Bloom Filters

Bloom filters are probabilistic data structures which are used to test whether an element is a member of a set. First introduced in 1970 by Burton Howard Bloom and since then, they have been widely used in applications where memory and computation are constrained.

Here we accept the False positives, but it is 100% accurate about Negatives, so no chance of getting a false negatives

Here where hash functions are used to check whether an element (here string) is present in set or not. Also a new element is added in the set only after some fixed number of search operations.

The basic idea behind a Bloom filter is to use a bit array of length m, and k different hash functions that map elements to positions in the bit array. When an element is inserted into the Bloom filter, each of the k hash functions is applied to the element, and the corresponding bit positions in the bit array are set to 1. To test whether an element is in the set, the same k hash functions are applied to the element, and if all of the corresponding bit positions in the bit array are set to 1, the element is considered to be in the set. However, if any of the bit positions are set to 0, the element is not in the set.

User defined Bloom Filters

Bloom Filter Details
More hash functions more robust
More size more accuracy
Applicable for large systems
When will a new word to be added ?