
Python String Length: 5 Programs to Find Length of a String

Five methods to find string length in Python: len(), for loop, recursion, non-whitespace count, and Unicode code points vs grapheme clusters.

By FACE Prep Team 5 min read

Python’s built-in len() returns the length of any string in one call and runs in O(1) time because CPython stores the character count as a cached field on every string object.

That said, placement coding rounds at Tier-2 and Tier-3 colleges across India frequently ask you to demonstrate the iterative approach, explain recursion, or handle specific tasks such as counting non-whitespace characters. This article covers all five practical methods and ends with the Unicode detail that catches even experienced developers off guard.

Using len(): The Standard Approach

len(s) is the idiomatic Python way to get a string’s length. The Python documentation for len() describes it as returning the number of items in a container; for strings, that means the number of Unicode code points.

s = "FACE Prep"
print(len(s))  # 9
s = "Hello, World!"
print(len(s))  # 13

len() counts every character: letters, digits, spaces, tabs (\t), newlines (\n), and punctuation. Nothing is excluded. The empty string returns 0.
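A minimal sketch to confirm the counting rule above. The sample strings are illustrative assumptions, not taken from any particular placement question.

```python
# Escape sequences such as \t and \n are single characters once parsed.
samples = {
    "": 0,             # empty string
    " ": 1,            # a lone space still counts
    "a\tb\n": 4,       # a, tab, b, newline
    "Hi, there!": 10,  # letters, comma, space, and exclamation mark
}

for text, expected in samples.items():
    # Prints the string, its length, and whether it matches the expectation.
    print(repr(text), len(text), len(text) == expected)
```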

Why O(1)? CPython’s string implementation stores the character count alongside the character data in the string’s internal C struct. The len() call reads that cached field directly, with no loop and no iteration. This constant-time guarantee holds whether the string has 5 characters or 5 million. Use len() whenever the goal is just to measure.

Counting Characters with a for Loop

The for-loop accumulator replicates len() manually. It is O(n) in time because it visits each character exactly once and increments a counter.

def find_length(s):
    count = 0
    for char in s:
        count += 1
    return count

print(find_length("FACE Prep"))  # 9
print(find_length(""))           # 0

This is the canonical “show your work” implementation. When interviewers in placement technical rounds say “write this without using built-ins,” the for-loop version is the expected answer. It arrives at the same number as len(), but by a different route: len() reads a stored count, while the loop visits every character and does so in Python bytecode rather than in CPython’s C layer. The wall-clock time therefore grows with the string and is noticeably higher on long strings.
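One way to see the gap on your own machine is a rough timing sketch with the standard timeit module. The string size and repeat count below are arbitrary assumptions, and the absolute numbers will vary from run to run, so no expected output is shown.

```python
import timeit

def find_length(s):
    count = 0
    for _ in s:
        count += 1
    return count

text = "x" * 100_000  # illustrative size, chosen arbitrarily

# Time 20 calls of each approach on the same string.
builtin_time = timeit.timeit(lambda: len(text), number=20)
loop_time = timeit.timeit(lambda: find_length(text), number=20)

print(f"len():    {builtin_time:.6f} s for 20 calls")
print(f"for loop: {loop_time:.6f} s for 20 calls")
```

Both approaches return the same count; only the time taken differs.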

A cleaner single-line variant using sum() and a generator expression:

length = sum(1 for _ in s)

Both produce identical results. The one-liner is idiomatic Python 3. The explicit loop is easier to explain step-by-step during a viva or whiteboard session, so it is worth knowing both forms.

The Recursive Approach

The recursive approach rests on one observation: the length of a non-empty string equals one plus the length of the string with its first character removed.

def length_recursive(s):
    if s == "":
        return 0
    return 1 + length_recursive(s[1:])

print(length_recursive("Python"))  # 6
print(length_recursive(""))        # 0

  • Base case: an empty string has length 0, so return 0 immediately.
  • Recursive case: strip the first character with s[1:], add 1, and recurse on the remainder.
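To make the two cases concrete, the sketch below records the argument each recursive call receives. The helper name remaining_slices is hypothetical and exists only for illustration.

```python
def remaining_slices(s):
    """Return the argument seen by each recursive call, in call order."""
    if s == "":
        return [s]                        # base case: the empty string
    return [s] + remaining_slices(s[1:])  # recursive case: drop one character

calls = remaining_slices("Python")
print(calls)           # ['Python', 'ython', 'thon', 'hon', 'on', 'n', '']
print(len(calls) - 1)  # 6: one non-empty call per character
```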

The recursion goes n calls deep, and Python keeps one stack frame alive per call. Each s[1:] slice also copies the remainder of the string, so the total time, and the memory held by those frames, grows as O(n²) rather than O(n). CPython’s default recursion limit is 1000, which you can check with sys.getrecursionlimit(). A string of roughly 1000 characters or more will raise a RecursionError with this naive implementation.

That ceiling makes the recursive version academic rather than practical: useful for demonstrating recursion concepts in a tutorial or interview setting, but never the right choice for measuring string length in real code. Use len() or the for-loop approach for any string that might be long.
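The ceiling is easy to reproduce. This sketch assumes the interpreter's recursion limit has not been raised to an extreme value; it builds a string slightly longer than the current limit and catches the resulting error.

```python
import sys

def length_recursive(s):
    if s == "":
        return 0
    return 1 + length_recursive(s[1:])

limit = sys.getrecursionlimit()
long_text = "a" * (limit + 100)  # deliberately deeper than the limit allows

try:
    result = length_recursive(long_text)
except RecursionError:
    result = None  # the recursive version gave up before finishing

print(result)          # None when the limit is hit
print(len(long_text))  # limit + 100, with no recursion involved
```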

Counting Non-Whitespace Characters

Sometimes the question is not “how many characters total” but “how many non-whitespace characters.” This pattern appears in form validation, text density checks, and basic preprocessing tasks.

def count_nonwhitespace(s):
    return sum(1 for c in s if not c.isspace())

print(count_nonwhitespace("Hello World"))   # 10
print(count_nonwhitespace("  FACE Prep "))  # 8

.isspace() returns True for spaces, tabs, newlines, carriage returns, and other Unicode whitespace characters. The generator expression passes only the non-whitespace characters to sum(), producing a single-pass O(n) count with O(1) extra space.
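The following sketch probes which characters .isspace() treats as whitespace. The choice of sample characters is an assumption made for illustration; note that the zero width space is invisible but is not classed as whitespace.

```python
checks = {
    " ": True,        # ordinary space
    "\t": True,       # tab
    "\n": True,       # newline
    "\r": True,       # carriage return
    "\u00a0": True,   # no-break space
    "\u2003": True,   # em space
    "\u200b": False,  # zero width space: invisible, yet isspace() is False
}

for ch, expected in checks.items():
    print(f"U+{ord(ch):04X}", ch.isspace(), ch.isspace() == expected)

# The no-break space is skipped, the zero width space is counted.
mixed = "a\u00a0b\u200bc"
print(sum(1 for c in mixed if not c.isspace()))  # 4
```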

The equivalent for-loop version, useful when you need to explain each step explicitly:

def count_nonwhitespace_loop(s):
    count = 0
    for c in s:
        if not c.isspace():
            count += 1
    return count

Both are correct. The generator expression is idiomatic Python 3 and is the form you will typically see in code reviews.

Understanding what .isspace() flags leads naturally into full character classification: whether a position holds a digit, upper-case letter, lower-case letter, or special character. The character classification program covers that ground directly.

Code Points, Not Grapheme Clusters

Python 3 strings are sequences of Unicode code points. len() counts code points, not bytes and not grapheme clusters. For most plain ASCII text this distinction makes no difference. It surfaces in two situations that matter in practice.
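A short sketch of the code-point versus byte distinction. The four sample characters are chosen for illustration; each is a single code point, but the encoded size differs by character and by encoding.

```python
samples = ["A", "\u00e9", "\u20b9", "\U0001F600"]  # A, é, ₹, 😀

for ch in samples:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")
    print(
        f"U+{ord(ch):04X}",
        f"code points={len(ch)}",
        f"utf-8 bytes={len(utf8)}",
        f"utf-16 bytes={len(utf16)}",
    )

# len() on the str is 1 every time; the UTF-8 sizes are 1, 2, 3, and 4 bytes.
```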

Composed versus decomposed characters

The character é can be represented as one code point (U+00E9, the precomposed form) or as two code points (the letter e at U+0065 followed by a combining acute accent at U+0301). Both look identical on screen. len() returns 1 for the precomposed form and 2 for the decomposed form.

Python’s unicodedata module handles this with NFC normalization, which collapses decomposed sequences into their single-code-point equivalents wherever a precomposed form exists:

import unicodedata

s1 = "\u00e9"       # precomposed e-with-accent: 1 code point
s2 = "e\u0301"      # decomposed: e + combining acute: 2 code points
print(len(s1))      # 1
print(len(s2))      # 2

n = unicodedata.normalize("NFC", s2)
print(len(n))       # 1, collapsed to precomposed form

The Python Unicode HOWTO explains all four normalization forms and when each one applies.
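Two consequences are worth checking by hand, sketched here with unicodedata.normalize. The q-plus-accent example is an illustrative assumption: it was picked because Unicode has no precomposed form for it, so NFC leaves it at two code points.

```python
import unicodedata

precomposed = "\u00e9"  # é as one code point
decomposed = "e\u0301"  # e + combining acute accent

# The two spellings look the same but do not compare equal as strings.
print(precomposed == decomposed)  # False

# Normalizing both sides first makes the comparison succeed.
same = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)
print(same)  # True

# NFD goes the other way and splits the precomposed form.
print(len(unicodedata.normalize("NFD", precomposed)))  # 2

# NFC cannot collapse a sequence that has no precomposed equivalent.
no_precomposed = "q\u0301"
print(len(unicodedata.normalize("NFC", no_precomposed)))  # 2
```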

Emoji and ZWJ sequences

An emoji such as the family sequence is a single grapheme cluster (one visual character) but is composed of multiple code points joined by Zero Width Joiner characters. len() counts each code point individually, returning a number larger than 1 for these sequences.

# Family emoji: man + ZWJ + woman + ZWJ + girl
emoji = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
print(len(emoji))   # 5 (three person code points plus two ZWJ characters)

For grapheme-cluster-accurate counting, use the third-party grapheme library, which groups ZWJ sequences correctly and returns the visual character count.
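If installing a dependency is not an option, a rough standard-library approximation can illustrate the idea. The function below is a hypothetical helper, not the grapheme library's algorithm and not full Unicode text segmentation: it only folds combining marks and ZWJ-joined code points into the preceding character, and it will miscount cases such as flag emoji and skin-tone modifiers.

```python
import unicodedata

ZWJ = "\u200d"

def approx_grapheme_count(s):
    """Rough visible-character count; NOT a full grapheme segmentation."""
    count = 0
    prev_was_zwj = False
    for ch in s:
        if ch == ZWJ:
            prev_was_zwj = True  # the next code point joins the current cluster
            continue
        # Mn, Me, Mc are the combining-mark categories (accents, variation selectors).
        is_mark = unicodedata.category(ch) in ("Mn", "Me", "Mc")
        extends_previous = count > 0 and (is_mark or prev_was_zwj)
        if not extends_previous:
            count += 1
        prev_was_zwj = False
    return count

family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
print(len(family), approx_grapheme_count(family))          # 5 1
print(len("e\u0301"), approx_grapheme_count("e\u0301"))    # 2 1
print(len("abc"), approx_grapheme_count("abc"))            # 3 3
```

For anything beyond a demonstration, a dedicated segmentation library remains the safer choice.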

The code-point distinction is small in isolation. It matters whenever you build text truncation logic or message length validators, because a limit measured in code points, bytes, or visible characters can give different answers for the same string. Language models add one more unit: they process text as tokens rather than code points or bytes.
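A small truncation sketch shows the risk. The truncate helper is hypothetical; it normalizes to NFC before slicing, which protects accents that have a precomposed form but would still split a ZWJ emoji sequence.

```python
import unicodedata

def truncate(s, limit):
    """Hypothetical helper: NFC-normalize, then keep at most `limit` code points."""
    return unicodedata.normalize("NFC", s)[:limit]

word = "cafe\u0301"       # "café" written with a combining accent

print(len(word))          # 5 code points for 4 visible characters
print(word[:4])           # 'cafe': plain slicing drops the accent
print(truncate(word, 4))  # 'café': the accent survives after normalization
```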

Time and Space Complexity

| Method | Time complexity | Space complexity | Notes |
| --- | --- | --- | --- |
| len(s) | O(1) | O(1) | CPython caches length as a struct field |
| for-loop counter | O(n) | O(1) | Iterates once, constant extra space |
| sum(1 for _ in s) | O(n) | O(1) | Generator expression, no intermediate list |
| Recursive | O(n²) | O(n²) | One stack frame per character; each s[1:] slice copies the remainder |
| Non-whitespace filter | O(n) | O(1) | Single pass with isspace() predicate |

For production code, len() is the right choice every time. The for-loop version communicates the algorithm clearly and is safe on strings of any length, only slower. The recursive version demonstrates recursion, but it does not belong in code that runs on user-supplied input of unknown length.
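As a sanity check, the sketch below runs the four total-length methods on the same inputs and confirms they agree. The function names and sample strings are assumptions made for this example.

```python
def loop_length(s):
    count = 0
    for _ in s:
        count += 1
    return count

def recursive_length(s):
    if s == "":
        return 0
    return 1 + recursive_length(s[1:])

samples = ["", "FACE Prep", "Hello, World!", "a\tb\n"]

all_agree = True
for text in samples:
    # A one-element set means every method returned the same number.
    results = {
        len(text),
        loop_length(text),
        sum(1 for _ in text),
        recursive_length(text),
    }
    print(repr(text), results)
    all_agree = all_agree and len(results) == 1

print(all_agree)  # True
```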

Further String Operations in Python

Measuring length is the gateway to string manipulation. Once you can count characters, a natural next step is reordering them: sorting a string alphabetically uses the same iteration logic with a comparison step added. For breadth, the character-type check program covers upper-case, lower-case, digit, and special-character detection.

For a wider set of foundational Python exercises that appear in placement coding rounds, Python basic programs collects the most common patterns with worked examples.

The observation that len() counts code points rather than grapheme clusters points to a broader truth: text processing at the level language models operate with uses tokens as the unit, not characters. If you have built the string intuition covered in this article, the step to understanding tokenization is shorter than it looks. TinkerLLM at ₹299 is a hands-on way to explore that connection; its exercises on prompt engineering and context windows build directly on string representation concepts from this article.

Frequently asked questions

Does len() count spaces in a Python string?

Yes. len() counts every character including spaces, tabs, and newlines. The string 'Hello World' returns 11 because the space is counted.

What is the time complexity of len() in Python?

O(1). CPython stores the length as a field in the internal string struct, so len() reads that field directly without iterating through characters.

Why does a for loop work to find string length?

Python strings are iterable sequences. The for loop visits each character once and an accumulator tracks the count. It arrives at the same number len() reports, although len() reads a stored count instead of iterating.

Can I find string length without len() in Python?

Yes. A for-loop counter, a recursive function, or sum(1 for _ in s) all work. They take at least O(n) time rather than the O(1) of len(), but are useful for interview explanations.

Does len() work correctly on Unicode strings and emoji?

len() counts Unicode code points. An emoji that is a ZWJ sequence is multiple code points, so len() returns more than 1 for it. For grapheme-accurate counting, use the third-party grapheme library.

How do I count only non-whitespace characters in a Python string?

Use sum(1 for c in s if not c.isspace()). This iterates the string once and skips spaces, tabs, and newlines in a single pass.
