MediumRating 1405

274. H-Index

arraysortingcounting-sort

解題說明

H-Index

題目描述：給定一位研究員的引用次數陣列 citations，其中 citations[i] 表示第 i 篇論文的被引用次數。H-index 定義為：最大的整數 h，使得該研究員至少有 h 篇論文各被引用至少 h 次。

解題思路：將 citations 由大到小排序後，逐一檢查每篇論文。在排序後的陣列中，索引 i（從 0 開始）代表「至少有 i+1 篇論文」，而 citations[i] 是這 i+1 篇中引用次數最少的。只要 citations[i] >= i+1，就滿足 H-index 至少為 i+1 的條件。因此，從左到右遍歷，找到最大的 i 使得 citations[i] >= i+1，答案即為 i+1。若沒有找到任何滿足條件的 i，則 H-index 為 0。另一種等價做法是從右往左找到第一個 citations[i] < i+1 的位置，答案為 i。

C++ 解法

複雜度分析

時間複雜度：O(n log n) — 排序需要 O(n log n)，之後的線性掃描為 O(n)，總體由排序決定。若使用計數排序（Counting Sort）版本，可降至 O(n)。

空間複雜度：O(1) — 排序版本只需常數額外空間（假設原地排序）。計數排序版本需要 O(n) 的計數陣列空間。

虛擬碼

Approach 1 - Sort + Linear Scan:
1. Sort citations in descending order
2. Initialize h = 0
3. For i from 0 to n-1:
   a. If citations[i] >= i+1:
      - Update h = i+1 (we have i+1 papers with at least i+1 citations each)
   b. Else:
      - Break (sorted order guarantees no further improvement)
4. Return h

Approach 2 - Counting Sort (O(n)):
1. Create count array of size n+1, all zeros
2. For each citation c: count[min(c, n)]++  (cap at n since h <= n always)
3. Traverse from h=n down to 0, accumulate total papers
4. Return first h where total >= h

其他解法

方法一：計數排序（Counting Sort） 時間複雜度：O(n)，空間複雜度：O(n) — 由於 h-index 不可能超過論文總數 n，可以建立大小為 n+1 的計數陣列。對每篇引用次數 c，將其累加到 count[min(c, n)] 中。接著從 h=n 開始向下累計，找到第一個使得累計論文數 ≥ h 的值，即為答案。此法避免了排序，達到線性時間。

方法二：二分搜尋 時間複雜度：O(n log n)，空間複雜度：O(1) — 先將陣列升序排序。對於候選答案 h（範圍 0 到 n），使用二分搜尋找到最大的 h，使得 citations[n-h] >= h。二分搜尋的範圍即為可能的 h 值。此法的實際效率與排序後線性掃描相近，但展示了二分搜尋的應用思路。

延伸思考

若研究員持續發表新論文且引用次數不斷更新，如何設計一個資料結構使得每次更新後能在 O(log n) 時間內查詢當前的 H-index？（提示：考慮使用排序集合或 BIT。）
LeetCode 275 題（H-Index II）假設 citations 陣列已排序，如何利用二分搜尋將時間複雜度優化為 O(log n)？
H-index 的定義中「至少有 h 篇論文各被引用至少 h 次」使用了兩個「至少」，若將其中一個「至少」改為「恰好」，定義會有何本質上的不同，計算難度又會如何改變？

解題說明

H-Index

複雜度分析

時間複雜度：O(n log n) — 排序需要 O(n log n)，之後的線性掃描為 O(n)，總體由排序決定。若使用計數排序（Counting Sort）版本，可降至 O(n)。

空間複雜度：O(1) — 排序版本只需常數額外空間（假設原地排序）。計數排序版本需要 O(n) 的計數陣列空間。

虛擬碼

Approach 1 - Sort + Linear Scan: 1. Sort citations in descending order 2. Initialize h = 0 3. For i from 0 to n-1: a. If citations[i] >= i+1: - Update h = i+1 (we have i+1 papers with at least i+1 citations each) b. Else: - Break (sorted order guarantees no further improvement) 4. Return h Approach 2 - Counting Sort (O(n)): 1. Create count array of size n+1, all zeros 2. For each citation c: count[min(c, n)]++ (cap at n since h <= n always) 3. Traverse from h=n down to 0, accumulate total papers 4. Return first h where total >= h

其他解法

延伸思考

若研究員持續發表新論文且引用次數不斷更新，如何設計一個資料結構使得每次更新後能在 O(log n) 時間內查詢當前的 H-index？（提示：考慮使用排序集合或 BIT。）
LeetCode 275 題（H-Index II）假設 citations 陣列已排序，如何利用二分搜尋將時間複雜度優化為 O(log n)？
H-index 的定義中「至少有 h 篇論文各被引用至少 h 次」使用了兩個「至少」，若將其中一個「至少」改為「恰好」，定義會有何本質上的不同，計算難度又會如何改變？