Control Limits Calculation Details
1. Summary
- Applies to Sample Types with CustomerID = 8675309 (TSMC).
- Control limits are calculated based on:
  - TSMC 2025 Supplier Quality Request.docx, Q-IQC-02-03-095 / Version 1.7
  - Section 4.4 Supplier SPC Request (version: 1.5)
- Only parameters sent on the eCOA are monitored.
- All of these parameters are considered KEY in the context of the TSMC document referenced above.
- The process that automatically calculates CL (Control Limits) runs daily at 1 AM CST.
- A new calculation is performed when any of the following applies:
  - The parameter's next CL calculation date is past due.
  - The parameter has never had a CL calculated.
  - A manual trigger is issued from the Power Apps dashboard.
2. Frequency of CL Calculation
- For parameters with fewer than 4 data points, no CL calculation is currently performed.
  - NOTE: The TSMC document states to follow the NMSE control line.
- For parameters with 4 or more batches, CL calculation is performed every:
  - 3 months: if the first data point's date (mfg date) is within 1 year of the current date.
  - 12 months: if the first data point's date (mfg date) is more than 1 year before the current date.
3. Rolling Data Pool
Each time a calculation is performed for a parameter, the data pool consists of the data points from the 2 years preceding the calculation date.
4. Outliers Screening
Normal: data points more than 4.5 sigma from the mean are removed from the CL calculation.
Skewed: a Yeo-Johnson transformation is applied, and then data points more than 4.5 sigma from the mean (in the transformed space) are removed from the CL calculation (see the sketch at the end of this section).
The Yeo-Johnson transformation is often considered a more versatile and robust option than the standard Box-Cox transformation used in Minitab, particularly for outlier detection in skewed data sets, since it also handles zero and negative values.
Other distribution types: all data points are used for the CL calculation.
OOS: no spec validation is performed. Since the data comes from approved samples, all test results are considered to be within spec limits.
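A minimal sketch of the Skewed screening step, assuming Python with NumPy and scikit-learn (the function name and sigma limit parameter are illustrative, not the production implementation):

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

def screen_outliers_skewed(values, sigma_limit=4.5):
    """Sketch: apply a Yeo-Johnson transformation, then drop points more
    than sigma_limit standard deviations from the mean in transformed space."""
    x = np.asarray(values, dtype=float).reshape(-1, 1)
    # Yeo-Johnson handles zero and negative values, unlike Box-Cox.
    pt = PowerTransformer(method="yeo-johnson", standardize=True)
    z = pt.fit_transform(x).ravel()  # standardized: mean 0, sd 1
    keep = np.abs(z) <= sigma_limit
    return np.asarray(values)[keep]
```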
5. Distribution Type
If a distribution was selected manually through Power Apps, this section does not apply.
When the next CL calculation date is past due and a CL calculation is triggered again, automatic distribution determination is always performed, regardless of whether a manual distribution was selected before.
When a new parameter is picked up by this process, the program automatically determines a distribution type. Once the parameter appears in the Power Apps dashboard, a different distribution can be selected manually, and a new calculation must then be triggered.
The distribution type is determined automatically using a series of tests.
The data set passes through the following tests, in this order, until a distribution type is matched:
5.1. Constant
Test: theoretically, when the variance of a data set is zero, all data points are identical. To account for floating-point rounding errors, if the computed variance is smaller than a certain threshold, the data set is treated as having no meaningful variation (near-zero variance) and is considered Constant. The standard convention for the threshold is the square root of the machine epsilon (approximately 1.49x10^-8).
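An illustrative sketch of this check (np.finfo(float).eps is NumPy's machine epsilon):

```python
import numpy as np

def is_constant(values):
    """Near-zero variance check; threshold = sqrt(machine epsilon) ~ 1.49e-8."""
    threshold = np.sqrt(np.finfo(float).eps)
    return np.var(np.asarray(values, dtype=float)) < threshold
```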
5.2. Near-Constant
Test: the relative frequency of the most frequent value (the mode) in the data set. If this value is greater than 0.95, the data set is considered Near-Constant.
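An illustrative sketch of the mode-frequency check:

```python
import numpy as np

def is_near_constant(values, threshold=0.95):
    """Mode's relative frequency above threshold => Near-Constant."""
    values = np.asarray(values)
    _, counts = np.unique(values, return_counts=True)
    return counts.max() / values.size > threshold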
5.3. Categorical
Test: the count of unique values in the data set. If the count is 1 or 2, the data set is considered Categorical.
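The same idea as a one-line check (illustrative):

```python
import numpy as np

def is_categorical(values):
    """1 or 2 unique values => Categorical."""
    return np.unique(np.asarray(values)).size <= 2
```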
5.4. Multimodal
Test: fit Gaussian Mixture Models (GMMs) to the data set using the Expectation-Maximization (EM) algorithm. Fit and compare models with 1, 2, and 3 clusters (mixture components) and determine G, the number of clusters in the best-fitting model. If G > 1, the data set is considered Multimodal.
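A sketch assuming scikit-learn's GaussianMixture (which fits via EM); the model-selection criterion is not specified in this document, so using BIC here is an assumption:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def is_multimodal(values, max_components=3):
    """Fit GMMs with 1..max_components clusters; pick the best-fitting
    model by BIC (assumption); more than one cluster => Multimodal."""
    x = np.asarray(values, dtype=float).reshape(-1, 1)
    bics = [GaussianMixture(n_components=k, random_state=0).fit(x).bic(x)
            for k in range(1, max_components + 1)]
    g = int(np.argmin(bics)) + 1  # G: clusters in the best-fitting model
    return g > 1
```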
5.5. Skewed
Test: the ratio of the third central moment to the cube of the standard deviation (the sample skewness). If the absolute value of this ratio is greater than 0.5, the data set is considered Skewed.
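An illustrative sketch using scipy.stats.skew, which computes exactly this ratio:

```python
from scipy.stats import skew

def is_skewed(values, threshold=0.5):
    """Sample skewness = third central moment / sd**3."""
    return abs(skew(values)) > threshold
```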
5.6. Normal
Test: perform the Shapiro-Wilk test for normality. If the p-value is greater than or equal to 0.5, the data set is considered Normal.
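An illustrative sketch using scipy.stats.shapiro, with the p-value threshold as stated above:

```python
from scipy.stats import shapiro

def is_normal(values, p_threshold=0.5):
    """Shapiro-Wilk test; p-value threshold as documented above."""
    _, p_value = shapiro(values)
    return p_value >= p_threshold
```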
5.7. Undetermined
If the data set does not fall into any of the above categories, it is considered Undetermined.
6. Control Limit Calculation
The calculation method depends on the distribution type of the data.
6.1. Normal
Outliers are removed using the method described in Outliers Screening above.
SD = standard deviation estimated using the Average Moving Range method (same as in Minitab).
CL = mean of the data set
LCL = CL - 3 * SD
UCL = CL + 3 * SD
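Minitab's Average Moving Range method estimates SD as the mean moving range of length 2 divided by the unbiasing constant d2 = 1.128; a minimal sketch of the Normal-case limits under that assumption:

```python
import numpy as np

D2 = 1.128  # unbiasing constant for a moving range of length 2

def normal_limits(values):
    """CL = mean; SD estimated from the average moving range (MRbar / d2)."""
    x = np.asarray(values, dtype=float)
    mr_bar = np.mean(np.abs(np.diff(x)))  # average moving range
    sd = mr_bar / D2
    cl = x.mean()
    return cl - 3 * sd, cl, cl + 3 * sd
```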
6.2. Skewed
Outliers are removed using the method described in Outliers Screening above.
SD = standard deviation estimated using the Average Moving Range method (same as in Minitab).
CL = Median of data set
LCL = CL - 3 * S_ROBUST_L
UCL = CL + 3 * S_ROBUST_U
| Number of Data Points | S_ROBUST_L | S_ROBUST_U |
|---|---|---|
| 4 - 100 | ( P50 - P05 ) / 1.645 | ( P95 - P50 ) / 1.645 |
| 101 - 300 | ( P50 - P03 ) / 1.881 | ( P97 - P50 ) / 1.881 |
| 301 - 3000 | ( P50 - P01 ) / 2.326 | ( P99 - P50 ) / 2.326 |
| 3001 - 10000 | ( P50 - P0.5 ) / 2.576 | ( P99.5 - P50 ) / 2.576 |
| > 10000 | ( P50 - P0.1 ) / 3.09 | ( P99.9 - P50 ) / 3.09 |
PXX: the XXth percentile, e.g., P99 is the 99th percentile.
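A minimal sketch of the robust sigma selection per the table above (names are illustrative):

```python
import numpy as np

# (max data points, lower tail percentile, upper tail percentile, divisor)
ROBUST_BANDS = [
    (100,   5.0,  95.0, 1.645),
    (300,   3.0,  97.0, 1.881),
    (3000,  1.0,  99.0, 2.326),
    (10000, 0.5,  99.5, 2.576),
]

def skewed_limits(values):
    """CL = median; robust sigma estimated from tail percentiles per the table."""
    x = np.asarray(values, dtype=float)
    for max_n, p_lo, p_hi, divisor in ROBUST_BANDS:
        if x.size <= max_n:
            break
    else:
        p_lo, p_hi, divisor = 0.1, 99.9, 3.09  # > 10000 data points
    p50 = np.percentile(x, 50)
    s_robust_l = (p50 - np.percentile(x, p_lo)) / divisor
    s_robust_u = (np.percentile(x, p_hi) - p50) / divisor
    return p50 - 3 * s_robust_l, p50, p50 + 3 * s_robust_u
```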
6.3. Constant
CL = P50 (same as Median)
LCL = CL
UCL = CL
6.4. Near-Constant / Categorical / Multimodal / Undetermined
Using: Empirical Cumulative Distribution Function (eCDF)
CL = Median
LCL = the empirical percentile corresponding to the -3 sigma threshold of a normal distribution (the 0.135th percentile).
UCL = the empirical percentile corresponding to the +3 sigma threshold of a normal distribution (the 99.865th percentile).
3 sigma is used because all parameters are considered KEY (see the Summary section).
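A minimal sketch, assuming Python with NumPy and SciPy; the normal CDF at -3 and +3 gives tail probabilities of about 0.00135 and 0.99865:

```python
import numpy as np
from scipy.stats import norm

def ecdf_limits(values):
    """CL = median; LCL/UCL = empirical percentiles at the normal
    +/- 3 sigma tail probabilities."""
    x = np.asarray(values, dtype=float)
    p_low = norm.cdf(-3) * 100   # ~0.135th percentile
    p_high = norm.cdf(3) * 100   # ~99.865th percentile
    return (np.percentile(x, p_low),
            np.percentile(x, 50),
            np.percentile(x, p_high))
```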