Placement Prep

Sort Elements by Frequency in an Array: Three Approaches

Three approaches to sort array elements by frequency: brute force, hash map, and Python Counter, with complexity analysis and worked examples.

By FACE Prep Team 5 min read
frequency-sort arrays data-structures dsa placement-prep python hash-map technical-round

Sorting an array by element frequency places the most-repeated values first, with equal-frequency elements keeping their original left-to-right order. Placement coding rounds use this pattern to test whether you default to nested loops or reach for a hash map.

Problem Statement and Test Cases

Given an array of integers, rearrange the elements in decreasing order of their frequency. When two elements share the same frequency, the one that appeared earlier in the original input comes first in the output.

Worked test cases (verify the logic manually before reading the solution):

  • Test Case 1

    • Input: [2, 3, 2, 4, 5, 12, 2, 3, 3, 3, 12]
    • Frequencies: 2 appears 3 times, 3 appears 4 times, 4 once, 5 once, 12 twice
    • Expected output: [3, 3, 3, 3, 2, 2, 2, 12, 12, 4, 5]
  • Test Case 2

    • Input: [1, 2, 3, 2, 1, 1, 4, 5, 6, 6]
    • Frequencies: 1 appears 3 times, 2 appears twice, 3 once, 4 once, 5 once, 6 twice
    • Expected output: [1, 1, 1, 2, 2, 6, 6, 3, 4, 5]

In Test Case 2, both 2 and 6 have frequency 2. Element 2 first appears at index 1 in the original input; element 6 first appears at index 8. So 2 comes before 6 in the output. The same tiebreaking logic applies in other array-manipulation problems that care about relative order, such as finding all symmetric pairs in an array.

Note also that Test Case 2 has three elements tied at frequency 1 (elements 3, 4, and 5). They appear in the output as 3, 4, 5 because their first occurrences in the input are at indices 2, 6, and 7 respectively.

Approach 1: Count and Sort Without a Hash Map

The brute-force approach counts element frequencies using nested loops and then sorts.

Algorithm

  • Step 1: For each element in the array, scan the entire array to count how many times it occurs. Store that count.
  • Step 2: Sort the array using those counts as the primary sort key, descending.
  • Step 3: Use a stable sort so elements with equal counts keep their original relative order.

When this works and when it does not

Time complexity is O(n²) because for every one of n elements you scan all n elements to count.

On a small input (n under 500), the brute-force approach is readable and runs fast enough. On a placement judge with n = 10000 or n = 100000 elements (the constraint sizes commonly used in online assessments), the brute-force will hit the time limit. The hash-map approach below is what judges expect.

Space complexity is O(1) auxiliary: you need a few counters and a sorted output, nothing that grows with the number of distinct elements.

Approach 2: Hash Map with Custom Sort

The hash-map approach replaces nested-loop counting with a single pass over the array, reducing frequency-counting from O(n²) to O(n).

Algorithm

  • Step 1: Traverse the array once and build a frequency map (element to count). This costs O(n) time and O(k) space, where k is the number of distinct elements.
  • Step 2: In the same pass, record the first occurrence index of each element. You need this for tiebreaking.
  • Step 3: Call sorted() with a tuple key (-frequency, first_occurrence_index). The negative frequency sorts by count descending; the first-occurrence index breaks ties by original order.
  • Step 4: Return the sorted array. Python’s sorted() is guaranteed stable, but the first-occurrence index makes the tiebreaking explicit and independent of that guarantee.

The arr.index() trap

A naive key function puts arr.index(x) inside the sort call:

sorted(arr, key=lambda x: (-freq[x], arr.index(x)))

This looks correct and produces the right output. But arr.index(x) scans the array from position 0 every time it is called. The sort calls the key function O(n log n) times (once per comparison), and each call is O(n), so the total cost balloons to O(n² log n). On a 10000-element array that is roughly 10⁸ operations.

Fix: build the first-occurrence dictionary before calling sorted, so the key function does an O(1) dictionary lookup instead:

first_pos = {}
for i, x in enumerate(arr):
    if x not in first_pos:
        first_pos[x] = i

Measuring the cost of a sort key separately from the sort itself is a principle that applies broadly. The equilibrium index problem is another case where a naive recalculation inside a loop inflates O(n) work to O(n²).

Python Implementation and Walkthrough

The implementation below uses Python’s collections.Counter for frequency counting and the tuple-key pattern for sorting:

from collections import Counter

def sort_by_frequency(arr):
    # Count frequencies in O(n)
    freq = Counter(arr)

    # Record first occurrence index in O(n)
    first_pos = {}
    for i, x in enumerate(arr):
        if x not in first_pos:
            first_pos[x] = i

    # Sort: primary key = frequency descending, secondary = original order
    return sorted(arr, key=lambda x: (-freq[x], first_pos[x]))

# Test Case 1
print(sort_by_frequency([2, 3, 2, 4, 5, 12, 2, 3, 3, 3, 12]))
# Output: [3, 3, 3, 3, 2, 2, 2, 12, 12, 4, 5]

# Test Case 2
print(sort_by_frequency([1, 2, 3, 2, 1, 1, 4, 5, 6, 6]))
# Output: [1, 1, 1, 2, 2, 6, 6, 3, 4, 5]

Step-by-step walkthrough — Test Case 1

Input: [2, 3, 2, 4, 5, 12, 2, 3, 3, 3, 12]

  • Step 1 — Count: Counter({3: 4, 2: 3, 12: 2, 4: 1, 5: 1})
  • Step 2 — First positions: {2: 0, 3: 1, 4: 3, 5: 4, 12: 5}
  • Step 3 — Tuple keys per distinct value:
    • 3 gets key (-4, 1)
    • 2 gets key (-3, 0)
    • 12 gets key (-2, 5)
    • 4 gets key (-1, 3)
    • 5 gets key (-1, 4)
  • Step 4 — Ascending sort of keys: (-4, 1) then (-3, 0) then (-2, 5) then (-1, 3) then (-1, 4)
  • Output: [3, 3, 3, 3, 2, 2, 2, 12, 12, 4, 5]

Notice how element 3 has key (-4, 1) and element 2 has key (-3, 0). Numerically -4 < -3, so 3 sorts ahead of 2 even though 2 has a smaller first-occurrence index (0 vs 1). The frequency is the primary key; first-occurrence position only matters for elements with identical frequencies.

Time and Space Complexity

ApproachTimeAuxiliary spaceSuitable for
Brute force (nested loops)O(n²)O(1)n under 500; no extra memory allowed
Hash map with first_pos dictO(n log n)O(n)Standard placement answer; n up to 10⁶
Counter + first_pos (Python)O(n log n)O(n)Same as above; cleaner code

For most placement coding rounds, the hash-map version is the expected answer. The brute-force will time-out on constraints above roughly 10000 elements. For a deeper look at why O(n) vs O(1) auxiliary space matters in memory-constrained environments, see space complexity of algorithms with examples.

Frequency counting (building a hash map of element counts and ranking by that count) is the same operation at the core of most text-processing pipelines. When an LLM is trained, one of the earliest steps is building a frequency distribution over billions of tokens to decide which subwords go into the vocabulary. The sort_by_frequency function above is structurally identical to that step, scaled down. If that connection interests you, TinkerLLM lets you run tokenisation experiments and inspect token frequency distributions at ₹299, a practical way to see the same data structure applied to real language data.

Primary sources

Frequently asked questions

What does stable sort mean in frequency sorting?

A stable sort preserves the relative order of elements that compare as equal. When two elements share the same frequency, a stable sort keeps the one that appeared earlier in the input ahead of the one that appeared later. Python's built-in sorted() is stable, which is why adding the first-occurrence index as a secondary sort key produces correct tiebreaking.

Why is arr.index() inside the sort key function inefficient?

arr.index(x) scans the array from the beginning each time it is called, taking O(n) per call. Because sorted() evaluates the key function once per element, the total cost becomes O(n) multiplied by O(n log n) comparisons, giving O(n² log n). Storing first-occurrence positions in a dictionary before sorting reduces this to O(1) per key lookup and O(n log n) total.

What is the time complexity of sorting by frequency with a hash map?

Counting frequencies with Counter takes O(n). Sorting with a tuple key (negative frequency, first-occurrence index) takes O(n log n). Total time complexity is O(n log n), which is the dominant term.

How do I sort by frequency in ascending order instead of descending?

Change the sort key from -freq[x] to freq[x]. The element that appears least often will come first. The tiebreaking logic using first-occurrence index stays the same.

What is the space complexity of the hash-map approach?

O(n) auxiliary space. The Counter dictionary holds at most n distinct keys, and the first_pos dictionary also holds at most n entries. Python's sorted() creates an internal O(n) working buffer. Total auxiliary space is O(n).

Does frequency sorting appear in placement coding tests?

Yes. It tests a combination of hash-map usage, custom sort keys, and awareness of tiebreaking edge cases. Expect it in the technical-coding sections of placement drives at companies that run online assessments with algorithm problems.

Build AI projects

A self-paced playground for building with LLMs.

TinkerLLM is FACE Prep's sister property. A guided environment for shipping real LLM applications, the kind of project that earns a paragraph on your resume, not a line.

Try TinkerLLM (₹299 launch)
Free AI Roadmap PDF