mirror of http://bgp.hk.skcks.cn:10086/https://github.com/krahets/hello-algo synced 2026-04-20 21:00:58 +08:00

Yudong Jin b01036b09e Revisit the English version (#1885 )

2026-04-10 23:03:03 +08:00

Subset-Sum Problem

Without Duplicate Elements

!!! question

    Given a positive integer array `nums` and a target positive integer `target`, find all possible combinations where the sum of elements in the combination equals `target`. The given array has no duplicate elements, and each element can be selected multiple times. Return these combinations in list form, where the list should not contain duplicate combinations.

For example, given the set $\{3, 4, 5\}$ and target integer $9$, the solutions are $\{3, 3, 3\}$ and $\{4, 5\}$. Note the following two points:

- Elements in the input set can be selected repeatedly without limit.
- Subsets do not distinguish element order; for example, $\{4, 5\}$ and $\{5, 4\}$ are the same subset.

Using the Permutation Solution as a Reference

Similar to the permutation problem, we can view the process of generating subsets as the result of a series of choices and update the running sum during the selection process. When the sum equals `target`, we record the subset in the result list.

Unlike the permutation problem, elements in this problem can be selected any number of times, so we do not need a `selected` boolean list to track whether an element has already been selected. With a few small changes to the permutation code, we obtain an initial solution:

[file]{subset_sum_i_naive}-[class]{}-[func]{subset_sum_i_naive}

Running the above code on array $[3, 4, 5]$ with target value $9$ produces $[3, 3, 3], [4, 5], [5, 4]$. Although we successfully found all subsets that sum to $9$, there are duplicate subsets $[4, 5]$ and $[5, 4]$.
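As a rough illustration of this first attempt, here is a minimal Python sketch. It is an assumption about what the referenced `subset_sum_i_naive` function looks like, not the book's actual listing; the helper name `backtrack` and its parameter order are chosen here for illustration only.

```python
def backtrack(state, target, total, choices, res):
    """Naive backtracking: every element is a candidate in every round."""
    # Record a solution when the running sum reaches the target
    if total == target:
        res.append(list(state))
        return
    for choice in choices:
        # Prune: skip choices that would push the running sum past the target
        if total + choice > target:
            continue
        state.append(choice)  # make the choice
        backtrack(state, target, total + choice, choices, res)
        state.pop()  # undo the choice


def subset_sum_i_naive(nums, target):
    """Return every selection sequence summing to target (order-sensitive)."""
    res = []
    backtrack([], target, 0, nums, res)
    return res


print(subset_sum_i_naive([3, 4, 5], 9))  # [[3, 3, 3], [4, 5], [5, 4]]
```

Because all elements are positive, the running sum grows with every choice, so the recursion is guaranteed to terminate.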

This is because the search process distinguishes the order of selections, but subsets do not distinguish selection order. As shown in the figure below, selecting 4 first and then 5 versus selecting 5 first and then 4 are different branches, but they correspond to the same subset.

To eliminate duplicate subsets, one straightforward idea is to deduplicate the result list. However, this approach is very inefficient for two reasons:

- When there are many array elements, especially when `target` is large, the search process generates many duplicate subsets.
- Comparing subsets (arrays) is very time-consuming, requiring sorting the arrays first, then comparing each element in them.
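To make the cost concrete, the following sketch shows what deduplicating after the search could look like. The function name `dedup_subsets` is hypothetical and only serves to illustrate the idea being rejected here: each subset must be sorted into a canonical form before it can be compared.

```python
def dedup_subsets(subsets):
    """Remove subsets that differ only in element order (post-hoc dedup)."""
    seen = set()
    unique = []
    for subset in subsets:
        # Sorting gives a canonical form; this costs O(k log k) per subset
        key = tuple(sorted(subset))
        if key not in seen:
            seen.add(key)
            unique.append(list(key))
    return unique


print(dedup_subsets([[3, 3, 3], [4, 5], [5, 4]]))  # [[3, 3, 3], [4, 5]]
```

Note that this only cleans up the output: the search has already spent time generating every duplicate before any of them is discarded.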

Pruning Duplicate Subsets

We consider deduplication through pruning during the search process. Observing the figure below, duplicate subsets occur when array elements are selected in different orders, as in the following cases:

1. When the first and second rounds select $3$ and $4$ respectively, all subsets containing these two elements are generated, denoted as $[3, 4, \dots]$.
2. Afterward, when the first round selects $4$, the second round should skip $3$, because the subset $[4, 3, \dots]$ generated by this choice is an exact duplicate of the subset generated in step 1.

In the search process, each level's choices are tried from left to right, so the rightmost branches are pruned more.

1. The first two rounds select $3$ and $5$, generating subset $[3, 5, \dots]$.
2. The first two rounds select $4$ and $5$, generating subset $[4, 5, \dots]$.
3. If the first round selects $5$, the second round should skip $3$ and $4$, because subsets $[5, 3, \dots]$ and $[5, 4, \dots]$ are exact duplicates of the subsets described in steps 1 and 2.

In summary, given an input array $[x_1, x_2, \dots, x_n]$, let the selection sequence in the search process be $[x_{i_1}, x_{i_2}, \dots, x_{i_m}]$. This selection sequence must satisfy $i_1 \leq i_2 \leq \dots \leq i_m$; any selection sequence that does not satisfy this condition will cause duplicates and should be pruned.
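The condition on index sequences can be checked on a small case. The sketch below is only an illustration (the function name is hypothetical, and indices are 0-based rather than the 1-based indices used in the formula): it enumerates every index sequence of length $m$ over $n$ elements and keeps those that are non-decreasing. The survivors coincide with what Python's `itertools.combinations_with_replacement` yields, which is one way to see that exactly one ordering per multiset remains.

```python
from itertools import combinations_with_replacement, product


def non_decreasing_index_sequences(n, m):
    """Keep only index sequences with i_1 <= i_2 <= ... <= i_m (0-based)."""
    all_sequences = product(range(n), repeat=m)
    return [
        seq
        for seq in all_sequences
        if all(a <= b for a, b in zip(seq, seq[1:]))
    ]


kept = non_decreasing_index_sequences(3, 2)
print(kept)  # [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]
# Each surviving sequence is one combination with repetition
print(kept == list(combinations_with_replacement(range(3), 2)))  # True
```

For three elements and two rounds, $9$ ordered sequences shrink to $6$; the three pruned ones, $(1, 0)$, $(2, 0)$ and $(2, 1)$, are exactly the reorderings of sequences already kept.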

Code Implementation

To implement this pruning, we initialize a variable `start` to indicate the starting point of traversal. After making choice $x_{i}$, set the next round to start traversal from index $i$. This ensures that the selection sequence satisfies $i_1 \leq i_2 \leq \dots \leq i_m$, guaranteeing subset uniqueness.

In addition, we have made the following two optimizations to the code:

- Before starting the search, first sort the array `nums`. When traversing all choices, end the loop immediately when the subset sum exceeds `target`, because subsequent elements are larger, and their subset sums must exceed `target`.
- Omit the element sum variable `total` and use subtraction on `target` to track the sum of elements. Record the solution when `target` equals $0$.

[file]{subset_sum_i}-[class]{}-[func]{subset_sum_i}
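A minimal Python sketch of the pruned search, under the assumptions described above (a `start` index, a sorted array, and subtraction on `target`). The exact signature is an assumption made for illustration; this sketch also sorts a copy of the input rather than sorting `nums` in place.

```python
def backtrack(state, target, choices, start, res):
    """Backtracking with a start index so indices never decrease."""
    # The remaining target hits 0 exactly when the subset sums to the goal
    if target == 0:
        res.append(list(state))
        return
    # Pruning 2: begin at `start` so that [4, 3, ...] is never generated
    for i in range(start, len(choices)):
        # Pruning 1: choices are sorted, so all later elements overshoot too
        if target - choices[i] < 0:
            break
        state.append(choices[i])  # make the choice
        # Pass i (not i + 1): the same element may be selected again
        backtrack(state, target - choices[i], choices, i, res)
        state.pop()  # undo the choice


def subset_sum_i(nums, target):
    """Return each combination summing to target exactly once."""
    choices = sorted(nums)  # sorted copy; the input list is left untouched
    res = []
    backtrack([], target, choices, 0, res)
    return res


print(subset_sum_i([3, 4, 5], 9))  # [[3, 3, 3], [4, 5]]
```

Compared with the naive version, `break` replaces `continue` in the overshoot check; this is only valid because the array is sorted.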

The figure below shows the complete backtracking process produced by running the above code on array $[3, 4, 5]$ with target value $9$.

With Duplicate Elements in Array

!!! question

    Given a positive integer array `nums` and a target positive integer `target`, find all possible combinations where the sum of elements in the combination equals `target`. **The given array may contain duplicate elements, and each element can be selected at most once**. Return these combinations in list form, where the list should not contain duplicate combinations.

Compared to the previous problem, the input array in this problem may contain duplicate elements, which introduces a new issue. For example, given array $[4, \hat{4}, 5]$ and target value $9$, the output of the existing code is $[4, 5], [\hat{4}, 5]$, which contains duplicate subsets.

The reason for this duplication is that equal elements are selected multiple times in a certain round. In the figure below, the first round has three choices, two of which are $4$, creating two duplicate search branches that output duplicate subsets. Similarly, the two $4$'s in the second round also produce duplicate subsets.

Pruning Equal Elements

To solve this problem, we need to limit equal elements to be selected only once in each round. The implementation is quite clever: since the array is already sorted, equal elements are adjacent. This means that in a given round of selection, if the current element equals the element to its left, then the same value has already been chosen in this round, so we skip the current element directly.

At the same time, this problem specifies that each array element can only be selected once. Fortunately, we can also use the variable `start` to satisfy this constraint: after making choice $x_{i}$, set the next round to start traversal from index $i + 1$ onwards. This both eliminates duplicate subsets and avoids selecting elements multiple times.

Code Implementation

[file]{subset_sum_ii}-[class]{}-[func]{subset_sum_ii}
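The two rules from the previous section (skip an element equal to its left neighbor within the same round, and continue from index $i + 1$) can be sketched in Python as follows. As before, this is an illustrative sketch with an assumed signature, not necessarily the book's exact listing, and it sorts a copy of the input.

```python
def backtrack(state, target, choices, start, res):
    """Backtracking where each element is used at most once."""
    if target == 0:
        res.append(list(state))
        return
    for i in range(start, len(choices)):
        # Sorted input: once one choice overshoots, all later ones do too
        if target - choices[i] < 0:
            break
        # Equal to the left neighbor in this round: that branch was
        # already explored, so skip it to avoid a duplicate subset
        if i > start and choices[i] == choices[i - 1]:
            continue
        state.append(choices[i])  # make the choice
        # Pass i + 1: the element at index i cannot be selected again
        backtrack(state, target - choices[i], choices, i + 1, res)
        state.pop()  # undo the choice


def subset_sum_ii(nums, target):
    """Return each combination summing to target once; inputs may repeat."""
    choices = sorted(nums)  # sorting makes equal elements adjacent
    res = []
    backtrack([], target, choices, 0, res)
    return res


print(subset_sum_ii([4, 4, 5], 9))  # [[4, 5]]
```

The check `i > start` (rather than `i > 0`) matters: it compares only against elements considered in the current round, so a value may still appear twice in one subset when the array contains it twice, as in `subset_sum_ii([1, 1, 2], 2)`, which yields `[[1, 1], [2]]`.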

The figure below shows the backtracking process for array $[4, 4, 5]$ with target value $9$, which includes four types of pruning operations. Combine the illustration with the code comments to understand the entire search process and how each pruning operation works.
