Placement Prep

Find Duplicate Elements in an Array: 5 Approaches

Five approaches to find duplicates in an array: O(n²) nested loops, sort-and-scan, hash set, XOR trick, and Floyd's cycle detection, with C, C++, and Python code.

By FACE Prep Team May 2026 7 min read

arrays algorithms c-plus-plus python data-structures placement-prep

Finding duplicate elements in an array is a standard placement coding question with five distinct solutions, each trading off time, space, and input constraints differently.

Two problem variants appear in tests. The general version has no range restriction: the array may hold any integer values, and any number of elements may repeat. The constrained version specifies that the array has n+1 elements with values in [1..n], which enables the O(n) O(1) solutions in Approaches 4 and 5. Approaches 1 through 3 work on both; Approaches 4 and 5 work only on the constrained version.

Approach 1: Nested-Loop Brute Force (C)

For each element, scan every element to its right. A match means the element is a duplicate.

Algorithm

Traverse the array from index 0 to n-2 (outer loop, index i).
For each arr[i], scan from index i+1 to n-1 (inner loop, index j).
If arr[i] == arr[j], print arr[i] and break the inner loop to avoid printing the same duplicate twice.

C Code

#include <stdio.h>

void findDuplicates(int arr[], int n) {
    printf("Repeating elements:\n");
    for (int i = 0; i < n - 1; i++) {
        for (int j = i + 1; j < n; j++) {
            if (arr[i] == arr[j]) {
                printf("%d ", arr[i]);
                break;
            }
        }
    }
    printf("\n");
}

int main() {
    int arr[] = {4, 3, 2, 7, 8, 3, 4};
    int n = sizeof(arr) / sizeof(arr[0]);
    findDuplicates(arr, n);
    return 0;
}

Output: Repeating elements: 4 3

Trace on `{4, 3, 2, 7, 8, 3, 4}`

i=0, arr[i]=4: check j=1(3), j=2(2), j=3(7), j=4(8), j=5(3), j=6(4) — match at j=6, print 4
i=1, arr[i]=3: check j=2(2), j=3(7), j=4(8), j=5(3) — match at j=5, print 3
i=2 through i=5: no matching element to the right

Complexity

Time: O(n²) — the inner loop runs up to n-1 iterations per outer step.
Space: O(1) — no auxiliary data structure beyond the input array.

This is the correct baseline. An interviewer who asks for a more efficient solution is testing whether you know the hash-set or sorted approach. Present this first, then improve.

Approach 2: Sort and Scan (C++)

Sort the array. Adjacent duplicates end up adjacent in sorted order and are found in a single scan.

Algorithm

Sort the array with std::sort.
Scan from index 1 to n-1.
If arr[i] == arr[i-1], print arr[i].

C++ Code

#include <iostream>
#include <algorithm>
using namespace std;

void findDuplicates(int arr[], int n) {
    sort(arr, arr + n);
    cout << "Repeating elements:\n";
    for (int i = 1; i < n; i++) {
        if (arr[i] == arr[i - 1]) {
            cout << arr[i] << " ";
        }
    }
    cout << "\n";
}

int main() {
    int arr[] = {4, 3, 2, 7, 8, 3, 4};
    int n = sizeof(arr) / sizeof(arr[0]);
    findDuplicates(arr, n);
    return 0;
}

Sorted array: {2, 3, 3, 4, 4, 7, 8}. Scan finds 3 (at index 2) and 4 (at index 4).

Complexity

Time: O(n log n) — dominated by std::sort.
Space: O(1) — sorting is in-place.

std::sort modifies the original array. If the rest of the program needs the original order preserved, copy the array before sorting or use the hash-set approach instead.

Approach 3: Hash Set (Python)

A set tracks which elements have already been seen. On each iteration, if the current element is already in the set, it is a duplicate.

Algorithm

Initialise an empty set seen.
For each element in the array:
- If the element is in seen, record it as a duplicate.
- Otherwise, add it to seen.

Python Code

def find_duplicates(arr):
    seen = set()
    duplicates = []
    for val in arr:
        if val in seen:
            duplicates.append(val)
        else:
            seen.add(val)
    return duplicates

arr = [4, 3, 2, 7, 8, 3, 4]
print("Repeating elements:", find_duplicates(arr))

Output: Repeating elements: [3, 4]

Python’s set type uses a hash table internally, so membership tests and insertions run in O(1) average time.

Trace on `{4, 3, 2, 7, 8, 3, 4}`

val=4: not in seen, add. seen=4
val=3: not in seen, add. seen=3
val=2: not in seen, add. seen=2
val=7: not in seen, add. seen=7
val=8: not in seen, add. seen=8
val=3: in seen, append 3
val=4: in seen, append 4

Output: [3, 4]

Complexity

Time: O(n) — one pass, O(1) average per element for set operations.
Space: O(n) — the set holds up to n distinct elements.

This is the default answer for the general problem. The trade-off is space: the set grows proportionally with the number of distinct values in the array.

Approach 4: XOR Trick (Constrained Problem Only)

Constraint: same constraint as Approach 5. The array has n+1 elements, all values are integers in [1..n], and exactly one value is repeated.

XOR of any value with itself equals 0. XOR of any value with 0 returns the value unchanged. XOR is commutative and associative.

If you XOR all array elements and then XOR the result with 1 through n, each number in [1..n] that appears exactly once cancels out. Only the repeated number remains, because it contributes one extra XOR that does not cancel.

Algorithm

Compute arr_xor as the XOR of all elements in the array.
Compute range_xor as the XOR of all integers from 1 to n (where n is the maximum value in [1..n]).
The duplicate equals arr_xor XOR range_xor.

C Code

#include <stdio.h>

int findDuplicate(int arr[], int n) {
    int arr_xor = 0, range_xor = 0;
    for (int i = 0; i < n; i++) arr_xor ^= arr[i];
    for (int i = 1; i < n; i++) range_xor ^= i;
    return arr_xor ^ range_xor;
}

int main() {
    /* n+1 = 5 elements, values in [1..4], duplicate = 2 */
    int arr[] = {1, 3, 4, 2, 2};
    int n = 5;
    printf("Duplicate: %d\n", findDuplicate(arr, n));
    return 0;
}

Output: Duplicate: 2

Trace on `{1, 3, 4, 2, 2}` (n=4, array has 5 elements)

arr_xor: 1 XOR 3 XOR 4 XOR 2 XOR 2 = 6
range_xor: 1 XOR 2 XOR 3 XOR 4 = 4
duplicate: 6 XOR 4 = 2

Binary check: 110 XOR 100 = 010 = 2. Confirmed.

Complexity

Time: O(n) — two linear passes.
Space: O(1) — two integer variables.

The XOR trick returns only one duplicate value and only works under the [1..n] constraint. For the general problem, use the hash set.

Approach 5: Floyd’s Cycle Detection

Constraint: the array has n+1 elements, values in [1..n], exactly one duplicate.

Map the array as an implicit linked list: treat index i as a node whose next pointer is arr[i]. Because all values fall in [1..n] and the array has n+1 entries, at least two indices point to the same next node, forming a cycle. The entry point of that cycle is the duplicate.

Floyd’s algorithm finds the cycle in two phases. Phase 1 moves a slow pointer (one step per iteration) and a fast pointer (two steps per iteration) until they meet inside the cycle. Phase 2 resets the slow pointer to index 0 and advances both pointers one step at a time; their new meeting point is the cycle entry, which is the duplicate.

Python Code

def find_duplicate_floyd(arr):
    slow = arr[0]
    fast = arr[arr[0]]

    # Phase 1: find meeting point inside the cycle
    while slow != fast:
        slow = arr[slow]
        fast = arr[arr[fast]]

    # Phase 2: find cycle entry = duplicate
    slow = 0
    while slow != fast:
        slow = arr[slow]
        fast = arr[fast]

    return slow

arr = [1, 3, 4, 2, 2]
print("Duplicate:", find_duplicate_floyd(arr))

Output: Duplicate: 2

Trace on `{1, 3, 4, 2, 2}`

Mapping from index to next: 0 to 1, 1 to 3, 2 to 4, 3 to 2, 4 to 2. The path from node 0: 0, 1, 3, 2, 4, 2, 4, … (cycle: nodes 2 and 4).

Phase 1 (slow starts at arr[0]=1, fast starts at arr[arr[0]]=3):

Step 1: slow=arr[1]=3, fast=arr[arr[3]]=arr[2]=4
Step 2: slow=arr[3]=2, fast=arr[arr[4]]=arr[2]=4
Step 3: slow=arr[2]=4, fast=arr[arr[4]]=arr[2]=4, slow equals fast at 4

Phase 2 (slow reset to 0, fast stays at 4):

Step 1: slow=arr[0]=1, fast=arr[4]=2
Step 2: slow=arr[1]=3, fast=arr[2]=4
Step 3: slow=arr[3]=2, fast=arr[4]=2, slow equals fast at 2

Duplicate is 2. Confirmed.

Complexity

Time: O(n) — both phases are linear.
Space: O(1) — two pointer variables.

Floyd’s is the optimal approach for the constrained problem: O(n) time, O(1) space, and it does not modify the array. The sort approach (Approach 2) also uses O(1) space but costs O(n log n) and mutates the array.

Complexity Comparison

Approach	Time	Space	Finds All Duplicates	Input Constraint
Brute force (C)	O(n²)	O(1)	Yes	None
Sort and scan (C++)	O(n log n)	O(1)	Yes	Mutates array
Hash set (Python)	O(n)	O(n)	Yes	None
XOR trick (C)	O(n)	O(1)	No — single only	Values in [1..n], one duplicate
Floyd’s cycle (Python)	O(n)	O(1)	No — single only	Values in [1..n], one duplicate

For the general problem, the hash set is the correct O(n) solution. For the constrained problem, Floyd’s is optimal: O(n) time, O(1) space, non-destructive.

Placement Interview Tips

Duplicate-detection problems appear in three forms in Indian placement tests.

The first is a direct coding question: write a function that returns all duplicate elements. The expected answer is Approach 3 (hash set). Approach 1 (brute force) is an acceptable starting point, but the interviewer will follow up asking for a more efficient solution.

The second form adds a space constraint: find the duplicate using O(1) extra memory. This is the cue for sorting (Approach 2) or, if the values are confirmed to be in [1..n], for Floyd’s (Approach 5).

The third form is a reasoning question: given an array where every value is in [1..n] and exactly one appears twice, which algorithm is most efficient? The answer is Floyd’s: O(n) time and O(1) space without modifying the array.

For the full range of array and data-structure problems that appear across these tests, see data structures interview questions. The single-pass update pattern in Approach 3 also underlies the O(n) scan in finding the smallest and largest element: initialise a candidate at index 0, walk once, update on each comparison.

The O(n) hash-set pass in Approach 3 is the same deduplication loop used in retrieval-augmented generation pipelines: retrieve candidate documents, scan once with a seen set, discard any already encountered. TinkerLLM at ₹299 builds that connection through exercises that extend single-pass array logic toward context-window management in LLM inference. The palindrome-check pattern is another O(n) single-pass variant worth reviewing alongside this article.

Primary sources

Frequently asked questions

What is the most efficient approach to find all duplicates in an array?

The hash-set approach finds all duplicates in O(n) time and O(n) space with a single pass. Use it as the default unless the problem explicitly requires O(1) extra space.

When does the XOR trick work for finding duplicates?

The XOR trick works when the array has exactly n+1 elements with values in the range [1..n] and exactly one element is repeated. For arrays with multiple duplicates or arbitrary value ranges, use a hash set instead.

Does sort-and-scan modify the original array?

Yes. std::sort in C++ is in-place and overwrites the original element order. If the original order must be preserved, copy the array before sorting or use the hash-set approach instead.

Can Floyd's cycle detection find multiple duplicates in one pass?

No. Floyd's algorithm identifies the entry point of one cycle, which corresponds to one duplicate value. If the array has multiple duplicates, only one is returned.

Why does Floyd's cycle detection use O(1) space?

The algorithm uses only two pointer variables regardless of array size. A hash set grows with n because it stores every unique element seen so far.

What should I say first when asked to find duplicates in a placement interview?

Ask whether extra space is allowed. If yes, use a hash set in O(n) time and O(n) space. If no, sort-and-scan in O(n log n) or Floyd's if the constrained n+1 range applies.

Build AI projects

A self-paced playground for building with LLMs.

TinkerLLM is FACE Prep's sister property. A guided environment for shipping real LLM applications, the kind of project that earns a paragraph on your resume, not a line.

Try TinkerLLM (₹299 launch)

Share WhatsApp LinkedIn Twitter

Approach 1: Nested-Loop Brute Force (C)

Algorithm

C Code

Trace on {4, 3, 2, 7, 8, 3, 4}

Complexity

Approach 2: Sort and Scan (C++)

Algorithm

C++ Code

Complexity

Approach 3: Hash Set (Python)

Algorithm

Python Code

Trace on {4, 3, 2, 7, 8, 3, 4}

Complexity

Approach 4: XOR Trick (Constrained Problem Only)

Algorithm

C Code

Trace on {1, 3, 4, 2, 2} (n=4, array has 5 elements)

Complexity

Approach 5: Floyd’s Cycle Detection

Python Code

Trace on {1, 3, 4, 2, 2}

Complexity

Complexity Comparison

Placement Interview Tips

Primary sources

Frequently asked questions

A self-paced playground for building with LLMs.

Trace on `{4, 3, 2, 7, 8, 3, 4}`

Trace on `{4, 3, 2, 7, 8, 3, 4}`

Trace on `{1, 3, 4, 2, 2}` (n=4, array has 5 elements)

Trace on `{1, 3, 4, 2, 2}`