Placement Prep

Array Subset Check in C, C++, Java and Python

Four methods to check if an array is a subset of another, with C, C++, Java and Python code. Time complexity and duplicate handling for placement interviews.

By FACE Prep Team 6 min read
arrays data-structures cpp java python placement-coding

An array arr2 is a subset of arr1 when every element in arr2 exists in arr1 at least as many times. Four classic methods solve this problem: brute-force two loops, sort with binary search, sort with merge scan, and hashing. This guide shows code in C, C++, Java, and Python, with complexity analysis and edge-case handling for placement coding rounds.

What “Subset” Means for Arrays

A subset relationship depends on how you define element presence. Two common definitions exist:

  • Set semantics: Each distinct value in arr2 must appear at least once in arr1. Duplicates are ignored.
  • Multiset semantics: Each value in arr2 must appear in arr1 with at least the same frequency.

Most placement questions use set semantics unless they explicitly mention “duplicates matter.” Ask the interviewer which definition applies before coding.

Examples

  • arr1 = {1, 2, 3, 4, 5} and arr2 = {3, 4, 5}: Yes, subset.
  • arr1 = {1, 2, 3, 4, 5} and arr2 = {1, 2, 9}: No, 9 is missing.
  • arr1 = {1, 2, 3} and arr2 = {2, 2}: Under set semantics, yes. Under multiset semantics, no (only one 2 in arr1).

Edge Cases

  • Empty arr2 is always a subset (vacuously true).
  • Identical arrays (arr1 == arr2) return true under both definitions.
  • Negative values and zeros work with all four methods because comparisons and hash keys are value-based.

Method 1: Two Loops

The brute-force approach checks each element of arr2 against every element of arr1. For each element at index i in arr2, the inner loop scans all of arr1 looking for a match. If any element goes unmatched, the function returns false.

C

#include <stdio.h>
#include <stdbool.h>

bool isSubset(int arr1[], int m, int arr2[], int n) {
    for (int i = 0; i < n; i++) {
        bool found = false;
        for (int j = 0; j < m; j++) {
            if (arr2[i] == arr1[j]) {
                found = true;
                break;
            }
        }
        if (!found) return false;
    }
    return true;
}

int main() {
    int arr1[] = {1, 2, 3, 4, 5};
    int arr2[] = {3, 4, 5};
    int m = sizeof(arr1) / sizeof(arr1[0]);
    int n = sizeof(arr2) / sizeof(arr2[0]);
    printf("%s\n", isSubset(arr1, m, arr2, n) ? "Yes" : "No");
    return 0;
}

Java

public class SubsetCheck {
    static boolean isSubset(int[] arr1, int[] arr2) {
        for (int val : arr2) {
            boolean found = false;
            for (int elem : arr1) {
                if (val == elem) {
                    found = true;
                    break;
                }
            }
            if (!found) return false;
        }
        return true;
    }
}

Python

def is_subset(arr1, arr2):
    for val in arr2:
        if val not in arr1:
            return False
    return True

Python’s in operator performs a linear scan on lists, so this is still O(m) per element despite the compact syntax.

Complexity: O(m x n) time, O(1) space. Acceptable when both arrays contain fewer than 1000 elements.

Sorting arr1 lets you search each arr2 element in O(log m) time instead of O(m). C uses qsort and bsearch from stdlib.h because the language has no built-in sort operator. Java’s TreeSet gives guaranteed O(log m) lookup per element via a red-black tree.

C

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

int compare(const void *a, const void *b) {
    return (*(int *)a - *(int *)b);
}

bool isSubset(int arr1[], int m, int arr2[], int n) {
    qsort(arr1, m, sizeof(int), compare);
    for (int i = 0; i < n; i++) {
        if (bsearch(&arr2[i], arr1, m, sizeof(int), compare) == NULL) {
            return false;
        }
    }
    return true;
}

Java (TreeSet)

import java.util.TreeSet;

public class SubsetCheck {
    static boolean isSubset(int[] arr1, int[] arr2) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int val : arr1) set.add(val);
        for (int val : arr2) {
            if (!set.contains(val)) return false;
        }
        return true;
    }
}

Python

def is_subset_sorted(arr1, arr2):
    sorted_arr1 = sorted(arr1)
    from bisect import bisect_left
    for val in arr2:
        idx = bisect_left(sorted_arr1, val)
        if idx == len(sorted_arr1) or sorted_arr1[idx] != val:
            return False
    return True

Complexity: O(m log m + n log m) time, O(1) auxiliary space with in-place sort, or O(m) if using TreeSet.

Duplicate caveat: Binary search finds the first occurrence but does not verify frequency. This method works only under set semantics. If arr2 contains repeated values, you need frequency tracking (Method 4).

Method 3: Sort and Merge Scan

Sorting both arrays and then running a two-pointer merge scan avoids repeated searches. The scan advances through both arrays simultaneously: if the current arr1 element is smaller, advance its pointer; if equal, advance both; if arr1 is larger, the arr2 element is missing.

C++

#include <algorithm>
#include <iostream>
using namespace std;

bool isSubset(int arr1[], int m, int arr2[], int n) {
    sort(arr1, arr1 + m);
    sort(arr2, arr2 + n);
    int i = 0, j = 0;
    while (i < m && j < n) {
        if (arr1[i] < arr2[j]) i++;
        else if (arr1[i] == arr2[j]) { i++; j++; }
        else return false;
    }
    return j == n;
}

Java

import java.util.Arrays;

public class SubsetCheck {
    static boolean isSubset(int[] arr1, int[] arr2) {
        Arrays.sort(arr1);
        Arrays.sort(arr2);
        int i = 0, j = 0;
        while (i < arr1.length && j < arr2.length) {
            if (arr1[i] < arr2[j]) i++;
            else if (arr1[i] == arr2[j]) { i++; j++; }
            else return false;
        }
        return j == arr2.length;
    }
}

Complexity: O(m log m + n log n) time, O(1) extra space with in-place sort.

This method handles duplicates correctly when both arrays are sorted because pointer advancement implicitly matches each occurrence. It is the best option when you cannot afford O(m) auxiliary memory.

Method 4: Hashing

Hashing builds a frequency map from arr1 and verifies each arr2 element against it. Each lookup is O(1) on average, and decrementing counts after each match handles duplicates correctly. This is the fastest approach for general-purpose subset checks.

C++ (unordered_map)

#include <unordered_map>
#include <iostream>
using namespace std;

bool isSubset(int arr1[], int m, int arr2[], int n) {
    unordered_map<int, int> freq;
    for (int i = 0; i < m; i++) freq[arr1[i]]++;
    for (int i = 0; i < n; i++) {
        if (freq[arr2[i]] == 0) return false;
        freq[arr2[i]]--;
    }
    return true;
}

Java (HashMap)

import java.util.HashMap;

public class SubsetCheck {
    static boolean isSubset(int[] arr1, int[] arr2) {
        HashMap<Integer, Integer> freq = new HashMap<>();
        for (int val : arr1) freq.put(val, freq.getOrDefault(val, 0) + 1);
        for (int val : arr2) {
            if (freq.getOrDefault(val, 0) == 0) return false;
            freq.put(val, freq.get(val) - 1);
        }
        return true;
    }
}

Python (Counter)

from collections import Counter

def is_subset_counter(arr1, arr2):
    c1, c2 = Counter(arr1), Counter(arr2)
    return not (c2 - c1)

The expression c2 - c1 returns a Counter containing only elements where c2[x] > c1[x]. If the result is empty (falsy), every element of arr2 appears in arr1 with sufficient frequency.

For unique-element arrays where duplicates are guaranteed absent, Python’s built-in set.issubset is even shorter:

def is_subset_set(arr1, arr2):
    return set(arr2).issubset(set(arr1))

Complexity: O(m + n) time, O(m) space for the frequency map.

Complexity Table and Method Selection

MethodTimeSpaceDuplicates
Two loopsO(m x n)O(1)Set only
Sort + binary searchO(m log m + n log m)O(1)Set only
Sort + mergeO(m log m + n log n)O(1)Correct
HashingO(m + n)O(m)Correct

For placement rounds, default to hashing. State O(m + n) time and O(m) space in your complexity analysis. If the interviewer asks for an O(1)-space solution, switch to sort-and-merge.

Choosing the right method also depends on input constraints. When m and n are both below 100, the brute-force approach is fine and easier to code under time pressure. When arrays arrive pre-sorted (common in competitive programming inputs), skip the sort step and run only the merge scan.

For foundational array traversal patterns like finding extremes in a single pass, see finding the smallest and largest element in an array. More sorting-based array problems appear in arranging flower sticks in a bouquet. A broader set of interview-ready patterns is at 20 most asked data structures interview questions.

The hash-map frequency check uses O(1) average lookup per key, the same constant-time key-value access pattern that drives LLM attention caches. If you want to trace how token lookups scale from Python dictionaries to transformer internals, TinkerLLM walks that path for ₹299.

Primary sources

Frequently asked questions

Why does C need qsort when C++ has std::sort?

C lacks templates and overloaded operators, so the standard library provides qsort with a comparator callback. C++ std::sort inlines comparisons and typically runs faster for the same data.

Does Java TreeSet handle duplicate elements correctly for subset checks?

TreeSet ignores duplicates because it stores unique keys. If arr2 contains duplicates, use a HashMap with frequency counts instead.

What does Python set.issubset return for lists with duplicates?

set.issubset converts lists to sets first, discarding duplicates. For multiset membership, use collections.Counter subtraction.

What time complexity should I state in a placement coding round?

State O(m+n) time and O(m) space for the hashing approach. Interviewers expect you to mention the trade-off between time and auxiliary memory.

Can Python Counter subtraction check subset membership?

Yes. If counter2 - counter1 is empty, every element of arr2 appears in arr1 with sufficient frequency. This handles duplicates correctly.

When should I use sort-and-merge over hashing?

Use sort-and-merge when you cannot allocate O(m) extra space and the input arrays are already sorted or sorting overhead is acceptable.

Build AI projects

A self-paced playground for building with LLMs.

TinkerLLM is FACE Prep's sister property. A guided environment for shipping real LLM applications, the kind of project that earns a paragraph on your resume, not a line.

Try TinkerLLM (₹299 launch)
Free AI Roadmap PDF