Company Corner

Program to Find Frequency of Characters in a String

Count occurrences of each character in a string using array and hashmap approaches in C, C++, Java, and Python with full code and complexity analysis.

By FACE Prep Team 6 min read
string-manipulation character-frequency placement-coding hashmap arrays

Counting how often each character appears in a string is one pass through the input and one data structure choice.

How Character Frequency Counting Works

The core logic is simple. Walk through the string one character at a time. For each character, increment its counter. At the end, every counter holds that character’s frequency. The entire algorithm fits in three lines of pseudocode, which is part of why interviewers use it as a building block rather than a standalone question.

The design choice is where you store those counters:

  • Fixed-size array (256 slots): use the character’s ASCII value as the index directly. freq[ch]++ is a single memory write with no hashing.
  • Hashmap or dictionary: use the character itself as a key. Handles any character set (Unicode included) but pays the cost of hash computation and possible collision resolution.

Complexity comparison

ApproachTimeSpaceUnicode support
256-element arrayO(n)O(1) fixed at 256 intsNo (ASCII only)
Hashmap / dictionaryO(n) amortisedO(k) where k = unique charsYes

Both are linear in the length of the string. The array approach is faster in practice for ASCII inputs because it replaces hash computation with a single integer-indexed memory access.

A third approach exists but performs worse: for each of the 256 possible ASCII characters, scan the entire string counting occurrences. That nested loop runs in O(256 * n) time. Still technically linear (256 is a constant), but it performs 256 full passes instead of one. Every solution below uses the single-pass method.

Trace through “google”

  • Start: all 256 slots at 0.
  • g (ASCII 103): freq[103] becomes 1.
  • o (ASCII 111): freq[111] becomes 1.
  • o: freq[111] becomes 2.
  • g: freq[103] becomes 2.
  • l (ASCII 108): freq[108] becomes 1.
  • e (ASCII 101): freq[101] becomes 1.
  • Result: e:1, g:2, l:1, o:2.

Code: Array-Based Approach (C and C++)

C implementation

#include <stdio.h>
#include <string.h>

void charFrequency(const char *str) {
    int freq[256] = {0};
    int len = strlen(str);

    for (int i = 0; i < len; i++)
        freq[(unsigned char)str[i]]++;

    for (int i = 0; i < 256; i++) {
        if (freq[i] > 0)
            printf("%c : %d\n", i, freq[i]);
    }
}

int main() {
    char str[] = "google";
    charFrequency(str);
    return 0;
}

C++ implementation

#include <iostream>
#include <cstring>
using namespace std;

void charFrequency(const char *str) {
    int freq[256] = {0};
    int len = strlen(str);

    for (int i = 0; i < len; i++)
        freq[(unsigned char)str[i]]++;

    for (int i = 0; i < 256; i++) {
        if (freq[i] > 0)
            cout << (char)i << " : " << freq[i] << endl;
    }
}

int main() {
    char str[] = "google";
    charFrequency(str);
    return 0;
}

The cast to unsigned char matters. Plain char is signed on most compilers, so characters with ASCII values above 127 would produce negative indices without the cast. This is a subtle bug that interviewers notice. If your input is guaranteed to be lowercase English letters only, you can simplify to freq[str[i] - 'a']++ with an array of 26 instead of 256.

For problems requiring per-character lookup beyond simple frequency (like checking membership in a set), C++ offers std::unordered_map as the hashmap equivalent. For pure frequency counting on ASCII strings, the fixed array wins on simplicity and speed.

Code: Hashmap Approach (Java and Python)

Java implementation

import java.util.HashMap;

public class CharFrequency {
    public static void main(String[] args) {
        String str = "google";
        HashMap<Character, Integer> freq = new HashMap<>();

        for (char ch : str.toCharArray()) {
            freq.put(ch, freq.getOrDefault(ch, 0) + 1);
        }

        for (var entry : freq.entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}

Python implementation

from collections import Counter

def char_frequency(s):
    freq = Counter(s)
    for ch, count in freq.items():
        print(f"{ch} : {count}")

char_frequency("google")

Python’s collections.Counter does the single-pass counting internally and returns a dictionary subclass. If you want the manual version (useful for interviews where interviewers want to see the logic):

def char_frequency_manual(s):
    freq = {}
    for ch in s:
        freq[ch] = freq.get(ch, 0) + 1
    for ch, count in freq.items():
        print(f"{ch} : {count}")

Both produce identical output. The manual version maps directly to how you would explain the algorithm on a whiteboard. In coding interviews, writing the manual version shows the interviewer you understand the mechanism. Use Counter in production code and timed contests where speed of writing matters more than demonstrating understanding.

Edge Cases and Design Decisions

Empty string

Both approaches handle this gracefully. The loop body never executes, and no output is printed. No special-case check needed. Interviewers sometimes test this explicitly: “What does your function do for empty input?” The answer is “nothing prints, no crash, no special case” which is exactly what a single-pass loop gives you for free.

All identical characters

Input "aaaa" produces a : 4. The array slot or hashmap key gets incremented four times. Correct by construction. This is another interviewer probe. If you used a nested loop that starts a new count for every distinct character already seen, you might print a : 4 four times. The single-pass approach avoids that entirely.

Case sensitivity

By default, 'A' (ASCII 65) and 'a' (ASCII 97) occupy different slots. If you need case-insensitive counting, convert to one case before indexing:

  • C/C++: freq[(unsigned char)tolower(str[i])]++
  • Java: freq.put(Character.toLowerCase(ch), ...)
  • Python: Counter(s.lower())

ASCII vs. Unicode

The 256-element array covers extended ASCII (0 to 255). For strings containing Devanagari, Tamil, or emoji code points, you need a hashmap. In Java and Python, the hashmap approach handles Unicode natively since Character and Python str already support full code points. In C/C++, you would need to parse UTF-8 multi-byte sequences or use wide-character types (wchar_t), which adds complexity.

For placement coding rounds, interviewers typically specify ASCII input unless stated otherwise. If the problem statement says “given a string of lowercase English letters”, use a 26-element array indexed by ch - 'a'. If it says “given a string of printable characters”, the 256-element array works. If no constraint is mentioned, ask. That question alone signals to the interviewer that you think about edge cases before writing code.

Where This Shows Up in Placement Rounds

Character frequency is rarely the final answer in an interview. It is the preprocessing step for harder problems:

  • Anagram detection: two strings are anagrams if and only if their character frequency arrays are identical.
  • First non-repeating character: build the frequency array, then scan the string again to find the first character with count 1.
  • String permutation checks: frequency comparison in O(n) instead of sorting in O(n log n).
  • Sliding window problems: maintain a frequency array for the current window and adjust counts as the window moves.

Google, Amazon, and Microsoft have all asked variants of these problems in on-campus coding rounds at Indian engineering colleges. The frequency array is the building block you reuse across all of them. In a typical 45-minute coding interview, you are unlikely to be asked just “count characters”. The interviewer asks it as a warm-up, then pivots to one of the harder variants above. Having the frequency-counting code written clean and fast in the first two minutes gives you time for the actual problem.

The difference between passing and failing these rounds often comes down to knowing which data structure fits which constraint. Fixed input alphabet and speed matters? Array. Unknown character set or need ordered iteration? Hashmap. If the follow-up asks “print characters in sorted order by frequency”, you will need to sort the hashmap entries or use a priority queue on top of the array. Knowing the base counting step cold means you spend your thinking time on the follow-up, not on the setup.

If you are preparing for multiple companies and want a structured comparison of platforms where you can drill these problems, see the placement preparation platform comparison. For students exploring roles beyond pure SDE positions, text-processing and analytics skills transfer to marketing and product analytics too.

Character frequency counting is also the first step in text tokenisation for NLP systems. When an LLM tokeniser builds its vocabulary, it starts by counting character and byte-pair frequencies across the training corpus. If you want to see how tokenisation actually works, split a sentence, watch the token IDs appear, and compare across models, TinkerLLM lets you do exactly that in a browser for ₹499. The connection between this placement-prep string problem and production NLP is one function call apart.

Primary sources

Frequently asked questions

Does the array approach work for Unicode strings?

No. A 256-element array covers only ASCII characters. For Unicode text containing Hindi, Tamil, or emoji characters, use a hashmap or dictionary that can key on any valid code point.

What is the time complexity of counting character frequency?

O(n) for a single pass through the string with either an array or a hashmap. The array has lower constant overhead because it avoids hash computation entirely.

How do I make the frequency count case-insensitive?

Convert each character to lowercase (or uppercase) before using it as an index or key. In C, use tolower(); in Python, call char.lower() before counting.

Why choose an array over a hashmap for ASCII strings?

The array gives direct O(1) indexing with no hash function, no collision handling, and no dynamic resizing. For the fixed 256-slot ASCII range, it is faster in practice even though both are O(n) overall.

Can this approach count word frequency instead of character frequency?

Not directly. Word frequency requires splitting the string by whitespace or delimiters first, then applying the same counting logic to each token rather than each character.

Build AI projects

A self-paced playground for building with LLMs.

TinkerLLM is FACE Prep's sister property. A guided environment for shipping real LLM applications, the kind of project that earns a paragraph on your resume, not a line.

Try TinkerLLM (₹499)
Free AI Roadmap PDF