SQL Code:
SELECT machine_id,
ROUND(AVG(end_time - start_time), 3) AS processing_time
FROM (
SELECT machine_id,
process_id,
MAX(CASE WHEN activity_type = 'end' THEN timestamp END) AS end_time,
MAX(CASE WHEN activity_type = 'start' THEN timestamp END) AS start_time
FROM Activity
GROUP BY machine_id, process_id
) AS process_times
GROUP BY machine_id;
Explanation:
-
Subquery:
- We use a subquery to calculate the start and end times for each
machine_id
andprocess_id
pair. - The
MAX(CASE WHEN activity_type = 'end' THEN timestamp END)
retrieves theend
timestamp for each process, andMAX(CASE WHEN activity_type = 'start' THEN timestamp END)
retrieves thestart
timestamp. - The
GROUP BY machine_id, process_id
ensures that we group the rows by bothmachine_id
andprocess_id
, which are unique pairs as described in the schema.
Subquery Explanation in Chinese:
- 我们使用子查询来计算每个
machine_id
和process_id
对应的开始和结束时间。 MAX(CASE WHEN activity_type = 'end' THEN timestamp END)
获取每个过程的end
时间戳,MAX(CASE WHEN activity_type = 'start' THEN timestamp END)
获取start
时间戳。GROUP BY machine_id, process_id
确保我们按machine_id
和process_id
分组,这些是唯一的组合。
- We use a subquery to calculate the start and end times for each
-
SELECT machine_id, ROUND(AVG(end_time – start_time), 3) AS processing_time:
- After obtaining the start and end times for each process, we calculate the difference (
end_time - start_time
) for each process to get the process duration. - The
AVG()
function is used to calculate the average processing time for each machine. - We use
ROUND()
to round the result to 3 decimal places, as required by the problem. - Finally, we group the results by
machine_id
to get the average processing time for each machine.
Explanation in Chinese:
- 获取每个过程的开始和结束时间后,我们计算每个过程的持续时间 (
end_time - start_time
)。 - 使用
AVG()
函数来计算每台机器的平均处理时间。 - 使用
ROUND()
将结果保留到小数点后 3 位。 - 最后,按
machine_id
分组,获取每台机器的平均处理时间。
- After obtaining the start and end times for each process, we calculate the difference (
-
GROUP BY machine_id:
- The final
GROUP BY machine_id
ensures that the average processing time is calculated for each machine individually.
Explanation in Chinese:
- 最后的
GROUP BY machine_id
确保我们为每台机器分别计算平均处理时间。
- The final
Key Concepts:
-
Subquery with Aggregation:
In the subquery, we useMAX()
with aCASE WHEN
clause to extract thestart
andend
timestamps from theActivity
table. This ensures that we get both timestamps for eachprocess_id
andmachine_id
.带聚合的子查询:
在子查询中,我们使用MAX()
和CASE WHEN
子句从Activity
表中提取start
和end
时间戳。这确保我们获取每个process_id
和machine_id
的两个时间戳。 -
Time Difference:
The time difference (end_time - start_time
) gives us the duration for each process, and averaging this difference provides the average processing time for each machine.时间差:
时间差 (end_time - start_time
) 提供每个过程的持续时间,计算此差值的平均值可以得到每台机器的平均处理时间。 -
ROUND():
We use theROUND()
function to round the result to 3 decimal places, as required.ROUND() 函数:
使用ROUND()
函数将结果四舍五入到小数点后 3 位。
Warnings:
-
Consistency in Time Units:
Ensure that thetimestamp
values in theActivity
table are consistent in terms of time units (e.g., seconds). Since the question specifiestimestamp
as a float representing seconds, the calculation should proceed without issues.时间单位的一致性:
确保Activity
表中的timestamp
值的时间单位一致(例如,秒)。问题中指定timestamp
是表示秒的浮点数,因此计算应该不会有问题。
Summary:
This query calculates the average time each machine takes to complete a process by calculating the time difference between end
and start
timestamps for each process and then averaging these values for each machine_id
. The result is rounded to 3 decimal places, as required.
中文总结:
该查询通过计算每个过程的 end
和 start
时间戳之间的时间差来计算每台机器完成一个过程的平均时间,然后为每个 machine_id
取这些值的平均值。结果保留到小数点后三位。
Tips:
-
Indexing for Performance:
Consider adding indexes onmachine_id
andprocess_id
to optimize the performance of this query, especially if theActivity
table is large.性能优化的索引:
考虑在machine_id
和process_id
列上创建索引,以优化查询性能,特别是在Activity
表很大的情况下。
Similar SQL Problems:
-
LeetCode 176: Second Highest Salary
This problem involves aggregation functions, similar to how theAVG()
function is used here. -
LeetCode 181: Employees Earning More Than Their Managers
This problem involves comparing data between related entities, similar to how we comparestart
andend
timestamps. -
LeetCode 183: Customers Who Never Order
This problem filters data based on specific conditions, similar to how we group and calculate the average time in this query. -
LeetCode 196: Delete Duplicate Emails
This problem involves identifying and filtering rows, similar to how we filterstart
andend
times to compute the process duration. -
LeetCode 184: Department Highest Salary
This problem involves calculating aggregated values, similar to how theAVG()
function is used to calculate the average process time.
Leave a Reply