Rabin-Karp Algorithm for Pattern Searching
Given two strings txt (the text) and pat (the pattern), both consisting of lowercase English letters, find all 0-based starting indices where pat occurs as a substring of txt.
Examples:
Input: txt = "geeksforgeeks", pat = "geeks"
Output: [0, 8]
Explanation: The string "geeks" occurs at index 0 and 8 in text.
Input: txt = "aabaacaadaabaaba", pat = "aaba"
Output: [0, 9, 12]
[Naive Approach] Brute Force Pattern Matching - O(n × m) Time and O(1) Space
The simplest way to solve this problem is to check for the pattern pat at every possible position in the text txt.
- Slide the pattern over the text one character at a time.
- For each position i from 0 to n - m (where n is the length of txt, m is the length of pat), compare the substring txt[i ... i + m - 1] with pat.
- If they match, store index i as an occurrence.
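The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the name naive_search is an assumption, as is the choice to return an empty list when the pattern is empty or longer than the text.

```python
def naive_search(txt, pat):
    """Return all 0-based indices where pat occurs in txt (brute force)."""
    n, m = len(txt), len(pat)
    result = []
    if m == 0 or m > n:
        return result  # assumed convention for degenerate inputs
    for i in range(n - m + 1):
        # Compare the window txt[i .. i + m - 1] with pat, character by character
        j = 0
        while j < m and txt[i + j] == pat[j]:
            j += 1
        if j == m:
            result.append(i)
    return result


print(naive_search("geeksforgeeks", "geeks"))    # [0, 8]
print(naive_search("aabaacaadaabaaba", "aaba"))  # [0, 9, 12]
```

Each of the n - m + 1 windows may need up to m comparisons, which is where the O(n × m) bound comes from.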
[Expected Approach] Rabin-Karp with Single Rolling Hash
In the naive string matching algorithm, we compare every substring of the text that has the pattern's length with the pattern, one window at a time.
Like the naive algorithm, the Rabin-Karp algorithm also examines every such substring. Unlike the naive algorithm, it first compares the hash value of the pattern with the hash value of the current substring, and compares individual characters only when the two hash values are equal. The Rabin-Karp algorithm therefore needs hash values for the following strings:
- The pattern itself
- All substrings of the text of length m, where m is the length of the pattern
How is Hash Value calculated in Rabin-Karp?
The hash value in Rabin-Karp is calculated using a rolling hash function, which allows efficient hash updates as the pattern slides over the text. Instead of recalculating the entire hash for each substring, the rolling hash lets us remove the contribution of the old character and add the new one in constant time.
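As a rough sketch of that constant-time update, the snippet below hashes only the first window in full and then slides it one character at a time. It assumes the polynomial hash defined in the following paragraphs, with base 31, modulus 10^9 + 7 and 'a' = 1, ..., 'z' = 26; the names val, poly_hash and window_hashes are illustrative only.

```python
MOD = 10**9 + 7   # assumed modulus
BASE = 31         # assumed base


def val(c):
    # Map 'a'..'z' to 1..26
    return ord(c) - ord('a') + 1


def poly_hash(s):
    # Hash computed from scratch in O(len(s))
    h = 0
    for c in s:
        h = (h * BASE + val(c)) % MOD
    return h


def window_hashes(txt, m):
    """Return the hash of every length-m window of txt, left to right."""
    n = len(txt)
    if m == 0 or m > n:
        return []
    high = pow(BASE, m - 1, MOD)   # weight of the window's first character
    h = poly_hash(txt[:m])         # only the first window is hashed in full
    hashes = [h]
    for i in range(1, n - m + 1):
        h = (h - val(txt[i - 1]) * high) % MOD      # remove the outgoing character
        h = (h * BASE + val(txt[i + m - 1])) % MOD  # shift and add the incoming one
        hashes.append(h)
    return hashes


hashes = window_hashes("geeksforgeeks", 5)
# The windows starting at 0 and 8 are both "geeks", so their hashes agree
print(hashes[0] == hashes[8])  # True
```

Each slide costs two multiplications and two additions regardless of m, which is the point of a rolling hash.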
A string is converted into a numeric hash using a polynomial rolling hash. For a string s of length n, the hash is computed as:
=> hash(s) = (s[0] * p^(n-1) + s[1] * p^(n-2) + ... + s[n-1] * p^0) % mod
Where:
- s[i] is the numeric value of the i-th character ('a' = 1, 'b' = 2, ..., 'z' = 26)
- p is a small prime number (commonly 31 or 37)
- mod is a large prime number (like 1e9 + 7) to avoid overflow and reduce hash collisions
This approach allows us to compute hash values of substrings in constant time using precomputed powers and prefix hashes.
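To make the formula concrete, here is a small sketch that evaluates it with Horner's rule. The values p = 31 and mod = 10^9 + 7 are the common choices mentioned above, and the function names are illustrative assumptions.

```python
MOD = 10**9 + 7
P = 31


def char_value(c):
    # 'a' = 1, 'b' = 2, ..., 'z' = 26
    return ord(c) - ord('a') + 1


def poly_hash(s):
    """Evaluate s[0]*P^(n-1) + ... + s[n-1]*P^0 modulo MOD."""
    h = 0
    for c in s:
        # Horner's rule: multiply the running value by P, then add the next character
        h = (h * P + char_value(c)) % MOD
    return h


# "abc" -> 1*31^2 + 2*31 + 3 = 961 + 62 + 3
print(poly_hash("abc"))  # 1026
```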
Hash Recurrence Relation:
Let preHash[i] represent the hash of the prefix substring s[0...i].
Then the recurrence is: preHash[i] = (preHash[i - 1] * base + s[i]) % mod
Where:
- preHash[0] = s[0]
- s[i] is the numeric value of the i-th character ('a' = 1, 'b' = 2, ..., 'z' = 26)
- base is the prime p from the hash formula above (commonly 31 or 37)
- All operations are done under modulo mod to avoid overflow
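A minimal sketch of this recurrence, assuming base = 31 and mod = 10^9 + 7 (the function name build_prefix_hash is hypothetical):

```python
MOD = 10**9 + 7
BASE = 31


def build_prefix_hash(s):
    """Return pre, where pre[i] is the hash of the prefix s[0..i]."""
    pre = []
    h = 0
    for c in s:
        # preHash[i] = preHash[i - 1] * base + s[i], taken modulo MOD
        h = (h * BASE + (ord(c) - ord('a') + 1)) % MOD
        pre.append(h)
    return pre


# 'a' -> 1, "ab" -> 1*31 + 2 = 33, "abc" -> 33*31 + 3 = 1026
print(build_prefix_hash("abc"))  # [1, 33, 1026]
```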
How to Compute Substring Hash in O(1):
Since we have computed preHash[] array:
=> preHash[i] → the hash of the prefix s[0...i]
=> power[i] → (p^i) % mod, for all required i
To compute the hash of a substring s[l...r] (from index l to index r, both inclusive), use:
hash(s[l...r]) = (preHash[r] - preHash[l - 1] * power[r - l + 1]) % mod, adding mod if the result is negative
if l == 0: hash(s[0...r]) = preHash[r]
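The substring formula can be checked with a short sketch such as the one below. It assumes base = 31 and mod = 10^9 + 7, and the names build and sub_hash are illustrative.

```python
MOD = 10**9 + 7
BASE = 31


def build(s):
    """Return (pre, power): prefix hashes and powers of BASE modulo MOD."""
    n = len(s)
    pre = [0] * n
    power = [1] * n
    for i, c in enumerate(s):
        v = ord(c) - ord('a') + 1
        prev = pre[i - 1] * BASE if i > 0 else 0
        pre[i] = (prev + v) % MOD
        if i > 0:
            power[i] = power[i - 1] * BASE % MOD
    return pre, power


def sub_hash(pre, power, l, r):
    """Hash of s[l..r] in O(1) from the precomputed arrays."""
    h = pre[r]
    if l > 0:
        # Python's % never returns a negative value, so no extra "+ MOD" is needed
        h = (h - pre[l - 1] * power[r - l + 1]) % MOD
    return h


pre, power = build("geeksforgeeks")
# s[0..4] and s[8..12] are both "geeks"
print(sub_hash(pre, power, 0, 4) == sub_hash(pre, power, 8, 12))  # True
```

In languages where % can return a negative value (C++, Java, C#, JavaScript), the subtraction needs the explicit correction that the sub helper in the code below performs.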
#include <iostream>
#include <vector>
#include <string>
using namespace std;

const int mod = 1e9 + 7;
const int base = 31;

// Modular addition to keep
// values within mod
int add(int a, int b) {
    a += b;
    if (a >= mod) a -= mod;
    return a;
}

// Modular subtraction to prevent
// negative values
int sub(int a, int b) {
    a -= b;
    if (a < 0) a += mod;
    return a;
}

// Modular multiplication for safe base exponentiation
int mul(int a, int b) {
    return (int)((1LL * a * b) % mod);
}

// Convert character to numeric value
// ('a' = 1, ..., 'z' = 26)
int charToInt(char c) {
    return c - 'a' + 1;
}

// Precomputes prefix hashes and powers for the string
void computeHash(string &s, vector<int> &hash, vector<int> &power) {
    int n = s.size();
    hash.resize(n);
    power.resize(n);

    // Initialize hash and power for
    // the first character
    hash[0] = charToInt(s[0]);
    power[0] = 1;

    // Build hash and power arrays using
    // rolling hash technique
    for (int i = 1; i < n; i++) {
        hash[i] =
            add(mul(hash[i - 1], base), charToInt(s[i]));
        power[i] =
            mul(power[i - 1], base);
    }
}

// Retrieves hash of substring s[l...r] in O(1)
int getSubHash(int l, int r,
               vector<int> &hash, vector<int> &power) {
    int h = hash[r];
    if (l > 0) {
        // Subtract prefix hash to isolate [l...r]
        h = sub(h, mul(hash[l - 1], power[r - l + 1]));
    }
    return h;
}

// Rabin-Karp function to find all pattern
// matches in text
vector<int> rabinKarp(string &text,
                      string &pattern) {
    int n = text.size(), m = pattern.size();

    vector<int> hashText, powerText;
    computeHash(text, hashText, powerText);

    vector<int> hashPat, powerPat;
    computeHash(pattern, hashPat, powerPat);

    // Full pattern hash
    int patternHash = hashPat[m - 1];

    vector<int> result;

    // Slide a window and compare text hash with pattern hash
    for (int i = 0; i <= n - m; i++) {
        int currentHash = getSubHash(i, i + m - 1,
                                     hashText, powerText);

        // If hash matches, add the index to result
        if (currentHash == patternHash) {
            result.push_back(i);
        }
    }
    return result;
}

int main() {
    string txt = "geeksforgeeks";
    string pat = "geek";

    vector<int> positions =
        rabinKarp(txt, pat);

    for (int idx : positions) {
        cout << idx << " ";
    }
    cout << endl;
    return 0;
}
import java.util.ArrayList;

public class GfG {
    static final int mod = 1000000007;
    static final int base = 31;

    // Modular addition to ensure value
    // stays within range
    static int add(int a, int b) {
        a += b;
        if (a >= mod) a -= mod;
        return a;
    }

    // Modular subtraction to handle negative
    // results correctly
    static int sub(int a, int b) {
        a -= b;
        if (a < 0) a += mod;
        return a;
    }

    // Modular multiplication for safe base exponentiation
    static int mul(int a, int b) {
        return (int)((1L * a * b) % mod);
    }

    // Converts character to integer ('a' = 1, ..., 'z' = 26)
    static int charToInt(char c) {
        return c - 'a' + 1;
    }

    // Precomputes prefix hashes and powers of
    // base for the given string
    static void computeHash(String s,
                            int[] hash, int[] power) {
        int n = s.length();

        // Initialize hash and power for the first character
        hash[0] = charToInt(s.charAt(0));
        power[0] = 1;

        // Build hash and power arrays using
        // rolling hash logic
        for (int i = 1; i < n; i++) {
            hash[i] = add(mul(hash[i - 1], base),
                          charToInt(s.charAt(i)));
            power[i] = mul(power[i - 1], base);
        }
    }

    // Retrieves the hash of substring s[l...r] in
    // O(1) using prefix hashes
    static int getSubHash(int l, int r, int[] hash, int[] power) {
        int h = hash[r];
        if (l > 0) {
            // Remove the contribution of prefix before l
            h = sub(h, mul(hash[l - 1], power[r - l + 1]));
        }
        return h;
    }

    // Rabin-Karp logic to find all starting indices
    // of pattern in text
    static ArrayList<Integer> rabinKarp(String text, String pattern) {
        int n = text.length(), m = pattern.length();

        // Precompute hashes and base powers
        // for both text and pattern
        int[] hashText = new int[n], powerText = new int[n];
        computeHash(text, hashText, powerText);

        int[] hashPat = new int[m], powerPat = new int[m];
        computeHash(pattern, hashPat, powerPat);

        // Full pattern hash
        int patternHash = hashPat[m - 1];

        ArrayList<Integer> result = new ArrayList<>();

        // Slide window over text and compare substring
        // hash with pattern hash
        for (int i = 0; i <= n - m; i++) {
            int currentHash =
                getSubHash(i, i + m - 1, hashText, powerText);

            // If hash matches, store the starting index
            if (currentHash == patternHash) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String txt = "geeksforgeeks";
        String pat = "geek";

        ArrayList<Integer> positions = rabinKarp(txt, pat);

        for (int pos : positions) {
            System.out.print(pos + " ");
        }
        System.out.println();
    }
}
mod = int(1e9 + 7)
base = 31

# Modular addition to avoid overflow and keep
# values within range
def add(a, b):
    a += b
    if a >= mod:
        a -= mod
    return a

# Modular subtraction to handle negative results correctly
def sub(a, b):
    a -= b
    if a < 0:
        a += mod
    return a

# Modular multiplication for safe exponentiation
# and base scaling
def mul(a, b):
    return (a * b) % mod

# Converts a character to a numeric value
# ('a' = 1, ..., 'z' = 26)
def charToInt(c):
    return ord(c) - ord('a') + 1

# Precomputes prefix hashes and powers of
# base for a given string
def computeHash(s):
    n = len(s)
    hash = [0] * n
    power = [0] * n

    # Initialize the first character's hash and base power
    hash[0] = charToInt(s[0])
    power[0] = 1

    # Rolling hash: hash[i] = hash[i-1]*base + s[i]
    for i in range(1, n):
        hash[i] = \
            add(mul(hash[i - 1], base), charToInt(s[i]))
        power[i] = mul(power[i - 1], base)
    return hash, power

# Retrieves hash of substring s[l...r] in O(1)
# using prefix hashes
def getSubHash(l, r, hash, power):
    h = hash[r]
    if l > 0:
        # Subtract the contribution of prefix before l
        h = sub(h, mul(hash[l - 1], power[r - l + 1]))
    return h

# Main Rabin-Karp function to find all
# pattern matches in the text
def rabinKarp(text, pattern):
    n, m = len(text), len(pattern)

    # Precompute hashes and base powers for
    # both text and pattern
    hashText, powerText = computeHash(text)
    hashPat, powerPat = computeHash(pattern)

    # Full pattern hash
    patternHash = hashPat[m - 1]
    result = []

    # Slide a window of size m over text
    # and compare hashes
    for i in range(n - m + 1):
        currentHash = \
            getSubHash(i, i + m - 1, hashText, powerText)

        # If hash matches, record the starting index
        if currentHash == patternHash:
            result.append(i)
    return result

if __name__ == "__main__":
    txt = "geeksforgeeks"
    pat = "geek"
    positions = rabinKarp(txt, pat)
    print(*positions)
using System;
using System.Collections.Generic;

class GfG {
    const int mod = 1000000007;
    const int baseVal = 31;

    // Modular addition to keep values within bounds
    static int add(int a, int b) {
        a += b;
        if (a >= mod) a -= mod;
        return a;
    }

    // Modular subtraction to ensure non-negative results
    static int sub(int a, int b) {
        a -= b;
        if (a < 0) a += mod;
        return a;
    }

    // Modular multiplication for safe base exponentiation
    static int mul(int a, int b) {
        return (int)((long)a * b % mod);
    }

    // Convert character to integer ('a' = 1, ..., 'z' = 26)
    static int charToInt(char c) {
        return c - 'a' + 1;
    }

    // Precomputes prefix hashes and base
    // powers for the given string
    static void computeHash(string s, int[] hash, int[] power) {
        int n = s.Length;

        // Initialize hash and power for the first character
        hash[0] = charToInt(s[0]);
        power[0] = 1;

        // Build hash and power arrays using
        // rolling hash formula
        for (int i = 1; i < n; i++) {
            hash[i] =
                add(mul(hash[i - 1], baseVal), charToInt(s[i]));
            power[i] =
                mul(power[i - 1], baseVal);
        }
    }

    // Retrieves the hash of substring s[l...r]
    // in O(1) using prefix hashes
    static int getSubHash(int l, int r, int[] hash, int[] power) {
        int h = hash[r];
        if (l > 0) {
            // Remove the contribution of characters before l
            h = sub(h, mul(hash[l - 1], power[r - l + 1]));
        }
        return h;
    }

    // Rabin-Karp main logic to find all
    // pattern matches in the text
    static List<int> rabinKarp(string text, string pattern) {
        int n = text.Length, m = pattern.Length;

        // Precompute hash and power arrays for the text
        int[] hashText = new int[n], powerText = new int[n];
        computeHash(text, hashText, powerText);

        // Precompute hash of the entire pattern
        int[] hashPat = new int[m], powerPat = new int[m];
        computeHash(pattern, hashPat, powerPat);
        int patternHash = hashPat[m - 1];

        List<int> result = new List<int>();

        // Slide window over text and compare hashes
        // with pattern hash
        for (int i = 0; i <= n - m; i++) {
            int currentHash =
                getSubHash(i, i + m - 1, hashText, powerText);

            // If hashes match, record the starting index
            if (currentHash == patternHash) {
                result.Add(i);
            }
        }
        return result;
    }

    static void Main() {
        string txt = "geeksforgeeks";
        string pat = "geek";

        var positions = rabinKarp(txt, pat);

        foreach (var pos in positions) {
            Console.Write(pos + " ");
        }
        Console.WriteLine();
    }
}
const mod = 1e9 + 7;
const base = 31;

// Modular addition to avoid overflow
// and keep within mod
function add(a, b) {
    a += b;
    if (a >= mod) a -= mod;
    return a;
}

// Modular subtraction to handle
// negative results correctly
function sub(a, b) {
    a -= b;
    if (a < 0) a += mod;
    return a;
}

// Modular multiplication for
// safe base-power computations.
// BigInt is used because a * b can exceed
// Number.MAX_SAFE_INTEGER (2^53 - 1) and lose precision
function mul(a, b) {
    return Number((BigInt(a) * BigInt(b)) % BigInt(mod));
}

// Converts character to numeric value
// ('a' = 1, ..., 'z' = 26)
function charToInt(c) {
    return c.charCodeAt(0) - 'a'.charCodeAt(0) + 1;
}

// Precomputes prefix hashes and powers of
// base for the given string
function computeHash(s) {
    const n = s.length;
    let hash = new Array(n).fill(0);
    let power = new Array(n).fill(0);

    // Initialize the first character's
    // hash and base power
    hash[0] = charToInt(s[0]);
    power[0] = 1;

    // Compute rolling hash and power values
    // for each position
    for (let i = 1; i < n; i++) {
        hash[i] = add(mul(hash[i - 1], base), charToInt(s[i]));
        power[i] = mul(power[i - 1], base);
    }
    return [hash, power];
}

// Retrieves hash of substring s[l...r] in O(1)
// using prefix hashes
function getSubHash(l, r, hash, power) {
    let h = hash[r];
    if (l > 0) {
        // Remove the contribution of prefix before l
        h = sub(h, mul(hash[l - 1], power[r - l + 1]));
    }
    return h;
}

// Main Rabin-Karp logic to find all matching indices
function rabinKarp(text, pattern) {
    const n = text.length, m = pattern.length;

    // Precompute hashes and powers for both text and pattern
    const [hashText, powerText] =
        computeHash(text);
    const [hashPat, powerPat] =
        computeHash(pattern);

    // Get the hash value of the entire pattern
    const patternHash = hashPat[m - 1];
    const result = [];

    // Slide the window over the text and compare hash values
    for (let i = 0; i <= n - m; i++) {
        const currentHash =
            getSubHash(i, i + m - 1, hashText, powerText);

        // If hash matches, record the index
        if (currentHash === patternHash) {
            result.push(i);
        }
    }
    return result;
}

// Driver Code
let txt = "geeksforgeeks";
let pat = "geek";
console.log(...rabinKarp(txt, pat));
Output
0 8
Time Complexity: O(n + m). Computing the prefix hashes and powers for the text and the pattern takes O(n + m); each of the n - m + 1 window hashes is then obtained and compared in O(1).
Auxiliary Space: O(n + m) for the prefix-hash and power arrays of the text and the pattern, plus O(k) for the result, where k is the number of matches (at most n).
[Efficient Approach] Rabin-Karp with Double Hashing
The idea behind double hashing is to reduce the probability of hash collisions by computing two hashes with different bases and moduli, and only considering a match if both hashes match.
Why Single Hash Can Fail (Hash Collisions):
When using a single hash function, there's always a chance that two different substrings may produce the same hash value. This is called a hash collision.
Why does it happen?
=> We're computing hashes modulo a large number (e.g., 10^9 + 7)
=> But since the number of possible substrings is huge, different substrings might accidentally result in the same hash after taking the modulo.
Consequence: If two different substrings have the same hash, an implementation that compares only hash values, like the one above, may falsely report a match (false positive). To rule this out, Rabin-Karp has to compare the actual characters whenever the hashes match, which slows it down.
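A hedged sketch of the verification step: the variant below confirms every hash match with a direct string comparison, so a collision can cost time but never produce a wrong index. The function name rabin_karp_verified and its base and mod parameters are assumptions made for illustration; it uses a sliding hash instead of prefix arrays to stay short.

```python
def rabin_karp_verified(txt, pat, base=31, mod=10**9 + 7):
    """Return all match positions, confirming each hash hit character by character."""
    n, m = len(txt), len(pat)
    if m == 0 or m > n:
        return []

    def val(c):
        return ord(c) - ord('a') + 1

    high = pow(base, m - 1, mod)  # weight of the window's first character
    hp = ht = 0
    for i in range(m):
        hp = (hp * base + val(pat[i])) % mod
        ht = (ht * base + val(txt[i])) % mod

    result = []
    for i in range(n - m + 1):
        # The slice comparison runs only when the hashes agree
        if ht == hp and txt[i:i + m] == pat:
            result.append(i)
        if i + m < n:
            # Slide the window: drop txt[i], append txt[i + m]
            ht = ((ht - val(txt[i]) * high) * base + val(txt[i + m])) % mod
    return result


print(rabin_karp_verified("geeksforgeeks", "geeks"))            # [0, 8]
# mod=1 makes every window collide; verification still gives the right answer
print(rabin_karp_verified("aabaacaadaabaaba", "aaba", mod=1))   # [0, 9, 12]
```

With a well-chosen modulus the comparison runs almost only on real matches; in the worst case (many collisions) the running time degrades toward O(n × m).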
Need for Double Hashing:
To reduce the probability of collisions, we use double hashing: we compute two independent hashes that use different base values (p1, p2) and different moduli (mod1, mod2).
How it helps:
=> Now, two substrings are considered equal only if both hash values match.
=> The probability of two different substrings colliding in both hash functions is extremely low: roughly 1/(mod1 × mod2), assuming the two hashes behave independently, which is practically negligible.
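The effect is easiest to see with deliberately tiny moduli. In the sketch below, the moduli 5 and 7 are assumptions chosen only to force a collision in the first hash; real code would use large primes such as 10^9 + 7 and 10^9 + 9. The function names are illustrative.

```python
def poly_hash(s, base, mod):
    # Polynomial hash with 'a' = 1, ..., 'z' = 26
    h = 0
    for c in s:
        h = (h * base + ord(c) - ord('a') + 1) % mod
    return h


def double_hash(s, base1=31, mod1=5, base2=37, mod2=7):
    """Return the pair of hashes; strings are treated as equal only if both agree."""
    return (poly_hash(s, base1, mod1), poly_hash(s, base2, mod2))


# 'a' = 1 and 'f' = 6 collide modulo 5 (both give 1) ...
print(poly_hash("a", 31, 5), poly_hash("f", 31, 5))   # 1 1
# ... but the second hash tells them apart, so the pairs differ
print(double_hash("a"), double_hash("f"))             # (1, 1) (1, 6)
```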
#include <iostream>
#include <vector>
#include <string>
#include <utility>
using namespace std;

const int mod1 = 1e9 + 7;
const int mod2 = 1e9 + 9;
const int base1 = 31;
const int base2 = 37;

// Performs modular addition
int add(int a, int b, int mod) {
    a += b;
    if (a >= mod) a -= mod;
    return a;
}

// Performs modular subtraction
int sub(int a, int b, int mod) {
    a -= b;
    if (a < 0) a += mod;
    return a;
}

// Performs modular multiplication
int mul(int a, int b, int mod) {
    return (int)((1LL * a * b) % mod);
}

// Converts character to a numeric
// value ('a' = 1, ..., 'z' = 26)
int charToInt(char c) {
    return c - 'a' + 1;
}

// Precomputes prefix hashes and power arrays for a given string
void computeHashes(string &s, vector<int> &hash1, vector<int> &hash2,
                   vector<int> &power1, vector<int> &power2) {
    int n = s.size();
    hash1.resize(n);
    hash2.resize(n);
    power1.resize(n);
    power2.resize(n);

    // Base case
    hash1[0] = hash2[0] = charToInt(s[0]);
    power1[0] = power2[0] = 1;

    for (int i = 1; i < n; i++) {
        // hash[i] = hash[i-1] * base + s[i]
        hash1[i] =
            add(mul(hash1[i - 1], base1, mod1), charToInt(s[i]), mod1);
        hash2[i] =
            add(mul(hash2[i - 1], base2, mod2), charToInt(s[i]), mod2);

        // power[i] = base^i
        power1[i] = mul(power1[i - 1], base1, mod1);
        power2[i] = mul(power2[i - 1], base2, mod2);
    }
}

// Returns the hash of substring s[l...r] using precomputed
// prefix hashes and powers
pair<int, int> getSubHash(int l, int r,
                          vector<int> &hash1, vector<int> &hash2,
                          vector<int> &power1, vector<int> &power2) {
    int h1 = hash1[r];
    int h2 = hash2[r];

    // If starting index is not 0, subtract
    // the hash of the previous prefix
    if (l > 0) {
        h1 = sub(h1, mul(hash1[l - 1], power1[r - l + 1], mod1), mod1);
        h2 = sub(h2, mul(hash2[l - 1], power2[r - l + 1], mod2), mod2);
    }
    return {h1, h2};
}

// Rabin-Karp main function: finds all occurrences of pattern in text
vector<int> rabinKarp(string &text, string &pattern) {
    int n = text.size(), m = pattern.size();

    vector<int> hashText1, hashText2, powerText1, powerText2;

    // Precompute hashes and powers for text
    computeHashes(text, hashText1, hashText2, powerText1, powerText2);

    // Precompute full hash of pattern
    vector<int> hashPat1, hashPat2, powerPat1, powerPat2;
    computeHashes(pattern, hashPat1, hashPat2, powerPat1, powerPat2);
    pair<int, int> patternHash = {hashPat1[m - 1], hashPat2[m - 1]};

    vector<int> result;

    // Slide the pattern over the text and compare hashes
    for (int i = 0; i <= n - m; i++) {
        pair<int, int> currentHash =
            getSubHash(i, i + m - 1, hashText1,
                       hashText2, powerText1, powerText2);

        // Match found at index i
        if (currentHash == patternHash) {
            result.push_back(i);
        }
    }
    return result;
}

int main() {
    string txt = "geeksforgeeks";
    string pat = "geek";

    vector<int> positions = rabinKarp(txt, pat);

    for (int idx : positions) {
        cout << idx << " ";
    }
    cout << endl;
    return 0;
}
import java.util.ArrayList;

public class GfG {
    static final int mod1 = 1000000007;
    static final int mod2 = 1000000009;
    static final int base1 = 31;
    static final int base2 = 37;

    // Performs modular addition
    static int add(int a, int b, int mod) {
        a += b;
        if (a >= mod) a -= mod;
        return a;
    }

    // Performs modular subtraction
    static int sub(int a, int b, int mod) {
        a -= b;
        if (a < 0) a += mod;
        return a;
    }

    // Performs modular multiplication
    static int mul(int a, int b, int mod) {
        return (int)((1L * a * b) % mod);
    }

    // Converts character to a numeric
    // value ('a' = 1, ..., 'z' = 26)
    static int charToInt(char c) {
        return c - 'a' + 1;
    }

    // Precomputes prefix hashes and power arrays for a given string
    static void computeHashes(String s, int[] hash1, int[] hash2,
                              int[] power1, int[] power2) {
        int n = s.length();
        hash1[0] = hash2[0] = charToInt(s.charAt(0));
        power1[0] = power2[0] = 1;

        for (int i = 1; i < n; i++) {
            hash1[i] = add(mul(hash1[i - 1], base1, mod1),
                           charToInt(s.charAt(i)), mod1);
            hash2[i] = add(mul(hash2[i - 1], base2, mod2),
                           charToInt(s.charAt(i)), mod2);
            power1[i] = mul(power1[i - 1], base1, mod1);
            power2[i] = mul(power2[i - 1], base2, mod2);
        }
    }

    // Returns the hash of substring s[l...r] using precomputed
    // prefix hashes and powers
    static int[] getSubHash(int l, int r, int[] hash1, int[] hash2,
                            int[] power1, int[] power2) {
        int h1 = hash1[r];
        int h2 = hash2[r];

        // If starting index is not 0, subtract
        // the hash of the previous prefix
        if (l > 0) {
            h1 = sub(h1, mul(hash1[l - 1],
                             power1[r - l + 1], mod1), mod1);
            h2 = sub(h2, mul(hash2[l - 1],
                             power2[r - l + 1], mod2), mod2);
        }
        return new int[]{h1, h2};
    }

    static ArrayList<Integer> rabinKarp(String text, String pattern) {
        int n = text.length(), m = pattern.length();

        int[] hashText1 = new int[n], hashText2 = new int[n];
        int[] powerText1 = new int[n], powerText2 = new int[n];
        computeHashes(text, hashText1, hashText2, powerText1, powerText2);

        int[] hashPat1 = new int[m], hashPat2 = new int[m];
        int[] powerPat1 = new int[m], powerPat2 = new int[m];
        computeHashes(pattern, hashPat1, hashPat2, powerPat1, powerPat2);

        int[] patternHash = {hashPat1[m - 1], hashPat2[m - 1]};
        ArrayList<Integer> result = new ArrayList<>();

        // Slide the pattern over the text and compare hashes
        for (int i = 0; i <= n - m; i++) {
            int[] currentHash =
                getSubHash(i, i + m - 1, hashText1,
                           hashText2, powerText1, powerText2);

            // Match found at index i
            if (currentHash[0] == patternHash[0]
                && currentHash[1] == patternHash[1]) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String txt = "geeksforgeeks";
        String pat = "geek";

        ArrayList<Integer> positions = rabinKarp(txt, pat);

        for (int idx : positions) {
            System.out.print(idx + " ");
        }
        System.out.println();
    }
}
mod1 = int(1e9 + 7)
mod2 = int(1e9 + 9)
base1 = 31
base2 = 37

# Performs modular addition
def add(a, b, mod):
    a += b
    if a >= mod:
        a -= mod
    return a

# Performs modular subtraction
def sub(a, b, mod):
    a -= b
    if a < 0:
        a += mod
    return a

# Performs modular multiplication
def mul(a, b, mod):
    return (a * b) % mod

# Converts character to a numeric
# value ('a' = 1, ..., 'z' = 26)
def charToInt(c):
    return ord(c) - ord('a') + 1

# Precomputes prefix hashes and power arrays
# for a given string
def computeHashes(s):
    n = len(s)
    hash1 = [0] * n
    hash2 = [0] * n
    power1 = [0] * n
    power2 = [0] * n

    hash1[0] = hash2[0] = charToInt(s[0])
    power1[0] = power2[0] = 1

    for i in range(1, n):
        hash1[i] = add(mul(hash1[i - 1], base1, mod1),
                       charToInt(s[i]), mod1)
        hash2[i] = add(mul(hash2[i - 1], base2, mod2),
                       charToInt(s[i]), mod2)
        power1[i] = mul(power1[i - 1], base1, mod1)
        power2[i] = mul(power2[i - 1], base2, mod2)
    return hash1, hash2, power1, power2

# Returns the hash of substring s[l...r] using precomputed
# prefix hashes and powers
def getSubHash(l, r, hash1, hash2, power1, power2):
    h1 = hash1[r]
    h2 = hash2[r]
    if l > 0:
        h1 = \
            sub(h1, mul(hash1[l - 1], power1[r - l + 1], mod1), mod1)
        h2 = \
            sub(h2, mul(hash2[l - 1], power2[r - l + 1], mod2), mod2)
    return (h1, h2)

def rabinKarp(text, pattern):
    n, m = len(text), len(pattern)

    hashText1, hashText2, powerText1, powerText2 = computeHashes(text)
    hashPat1, hashPat2, powerPat1, powerPat2 = computeHashes(pattern)

    patternHash = (hashPat1[m - 1], hashPat2[m - 1])
    result = []

    for i in range(n - m + 1):
        currentHash = getSubHash(i, i + m - 1,
                                 hashText1, hashText2, powerText1, powerText2)
        if currentHash == patternHash:
            result.append(i)  # Match found at index i
    return result

if __name__ == "__main__":
    txt = "geeksforgeeks"
    pat = "geek"
    positions = rabinKarp(txt, pat)
    print(*positions)
using System;
using System.Collections.Generic;

class GfG {
    const int mod1 = 1000000007;
    const int mod2 = 1000000009;
    const int base1 = 31;
    const int base2 = 37;

    // Performs modular addition
    static int add(int a, int b, int mod) {
        a += b;
        if (a >= mod) a -= mod;
        return a;
    }

    // Performs modular subtraction
    static int sub(int a, int b, int mod) {
        a -= b;
        if (a < 0) a += mod;
        return a;
    }

    // Performs modular multiplication
    static int mul(int a, int b, int mod) {
        return (int)((long)a * b % mod);
    }

    // Converts character to a numeric value ('a' = 1, ..., 'z' = 26)
    static int charToInt(char c) {
        return c - 'a' + 1;
    }

    // Precomputes prefix hashes and power arrays
    // for a given string
    static void computeHashes(string s, int[] hash1, int[] hash2,
                              int[] power1, int[] power2) {
        int n = s.Length;
        hash1[0] = hash2[0] = charToInt(s[0]);
        power1[0] = power2[0] = 1;

        for (int i = 1; i < n; i++) {
            hash1[i] = add(mul(hash1[i - 1], base1, mod1),
                           charToInt(s[i]), mod1);
            hash2[i] = add(mul(hash2[i - 1], base2, mod2),
                           charToInt(s[i]), mod2);
            power1[i] = mul(power1[i - 1], base1, mod1);
            power2[i] = mul(power2[i - 1], base2, mod2);
        }
    }

    static Tuple<int, int> getSubHash(int l, int r, int[] hash1, int[] hash2,
                                      int[] power1, int[] power2) {
        int h1 = hash1[r], h2 = hash2[r];
        if (l > 0) {
            h1 = sub(h1, mul(hash1[l - 1], power1[r - l + 1], mod1), mod1);
            h2 = sub(h2, mul(hash2[l - 1], power2[r - l + 1], mod2), mod2);
        }
        return Tuple.Create(h1, h2);
    }

    static List<int> rabinKarp(string text, string pattern) {
        int n = text.Length, m = pattern.Length;

        int[] hashText1 = new int[n], hashText2 = new int[n];
        int[] powerText1 = new int[n], powerText2 = new int[n];
        computeHashes(text, hashText1, hashText2, powerText1, powerText2);

        int[] hashPat1 = new int[m], hashPat2 = new int[m];
        int[] powerPat1 = new int[m], powerPat2 = new int[m];
        computeHashes(pattern, hashPat1, hashPat2, powerPat1, powerPat2);

        var patternHash = Tuple.Create(hashPat1[m - 1], hashPat2[m - 1]);
        List<int> result = new List<int>();

        for (int i = 0; i <= n - m; i++) {
            var currentHash =
                getSubHash(i, i + m - 1, hashText1, hashText2,
                           powerText1, powerText2);
            if (currentHash.Equals(patternHash)) {
                result.Add(i);  // Match found at index i
            }
        }
        return result;
    }

    static void Main() {
        string txt = "geeksforgeeks";
        string pat = "geek";

        var positions = rabinKarp(txt, pat);

        foreach (var pos in positions)
            Console.Write(pos + " ");
        Console.WriteLine();
    }
}
const mod1 = 1e9 + 7;
const mod2 = 1e9 + 9;
const base1 = 31;
const base2 = 37;

// Performs modular addition
function add(a, b, mod) {
    a += b;
    if (a >= mod) a -= mod;
    return a;
}

// Performs modular subtraction
function sub(a, b, mod) {
    a -= b;
    if (a < 0) a += mod;
    return a;
}

// Performs modular multiplication.
// BigInt is used because a * b can exceed
// Number.MAX_SAFE_INTEGER (2^53 - 1) and lose precision
function mul(a, b, mod) {
    return Number((BigInt(a) * BigInt(b)) % BigInt(mod));
}

// Converts character to numeric value ('a' = 1, ..., 'z' = 26)
function charToInt(c) {
    return c.charCodeAt(0) - 'a'.charCodeAt(0) + 1;
}

// Precomputes prefix hashes and power arrays
function computeHashes(s) {
    let n = s.length;
    let hash1 = new Array(n), hash2 = new Array(n);
    let power1 = new Array(n), power2 = new Array(n);

    hash1[0] = hash2[0] = charToInt(s[0]);
    power1[0] = power2[0] = 1;

    for (let i = 1; i < n; i++) {
        hash1[i] =
            add(mul(hash1[i - 1], base1, mod1), charToInt(s[i]), mod1);
        hash2[i] =
            add(mul(hash2[i - 1], base2, mod2), charToInt(s[i]), mod2);
        power1[i] = mul(power1[i - 1], base1, mod1);
        power2[i] = mul(power2[i - 1], base2, mod2);
    }
    return [hash1, hash2, power1, power2];
}

// Returns hash of substring s[l...r]
function getSubHash(l, r, hash1, hash2, power1, power2) {
    let h1 = hash1[r], h2 = hash2[r];
    if (l > 0) {
        h1 = sub(h1, mul(hash1[l - 1],
                         power1[r - l + 1], mod1), mod1);
        h2 = sub(h2, mul(hash2[l - 1],
                         power2[r - l + 1], mod2), mod2);
    }
    return [h1, h2];
}

function rabinKarp(text, pattern) {
    const n = text.length, m = pattern.length;

    let [hashText1, hashText2, powerText1, powerText2] =
        computeHashes(text);
    let [hashPat1, hashPat2] = computeHashes(pattern);

    let patternHash = [hashPat1[m - 1], hashPat2[m - 1]];
    let result = [];

    for (let i = 0; i <= n - m; i++) {
        let [h1, h2] = getSubHash(i, i + m - 1, hashText1,
                                  hashText2, powerText1, powerText2);
        if (h1 === patternHash[0] && h2 === patternHash[1]) {
            result.push(i);  // Match found at index i
        }
    }
    return result;
}

// Driver Code
let txt = "geeksforgeeks";
let pat = "geek";
let result = rabinKarp(txt, pat);
console.log(...result);
Output
0 8
Time Complexity: O(n + m). Computing both sets of prefix hashes and powers for the text and the pattern takes O(n + m); each of the n - m + 1 window hash pairs is then obtained and compared in O(1).
Auxiliary Space: O(n + m) for the prefix-hash and power arrays of the text and the pattern, plus O(k) for the result, where k is the number of matches (at most n).
Limitations of Rabin-Karp Algorithm
When the hash value of the pattern equals the hash value of a text window but the window is not actually the pattern, the match is called a spurious hit. In an implementation that verifies matches, every spurious hit costs an extra character-by-character comparison and so increases the running time; in a hash-only implementation it produces a wrong index. A good hash function (a large prime modulus, or double hashing as described above) keeps spurious hits rare.
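To see how the modulus affects spurious hits, the sketch below counts them for a given text and pattern. The function name count_hits and its parameters are illustrative assumptions, and each window is re-hashed from scratch purely for clarity.

```python
def poly_hash(s, base, mod):
    # Polynomial hash with 'a' = 1, ..., 'z' = 26
    h = 0
    for c in s:
        h = (h * base + ord(c) - ord('a') + 1) % mod
    return h


def count_hits(txt, pat, base=31, mod=10**9 + 7):
    """Return (true_matches, spurious_hits) over all windows of len(pat)."""
    m = len(pat)
    target = poly_hash(pat, base, mod)
    true_matches = spurious = 0
    for i in range(len(txt) - m + 1):
        window = txt[i:i + m]
        # Recomputed per window for clarity; real code would roll the hash
        if poly_hash(window, base, mod) == target:
            if window == pat:
                true_matches += 1
            else:
                spurious += 1
    return true_matches, spurious


# mod = 1 maps every string to 0, so all 8 non-matching windows are spurious hits
print(count_hits("geeksforgeeks", "geek", mod=1))  # (2, 8)
# With the large prime modulus there are none for this input
print(count_hits("geeksforgeeks", "geek"))         # (2, 0)
```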