Leetcode SQL: 1661. Average Time of Process per Machine

SQL Code:

SELECT machine_id, 
       ROUND(AVG(end_time - start_time), 3) AS processing_time
FROM (
    SELECT machine_id, 
           process_id,
           MAX(CASE WHEN activity_type = 'end' THEN timestamp END) AS end_time,
           MAX(CASE WHEN activity_type = 'start' THEN timestamp END) AS start_time
    FROM Activity
    GROUP BY machine_id, process_id
) AS process_times
GROUP BY machine_id;

Explanation:

  1. Subquery:

    • We use a subquery to calculate the start and end times for each machine_id and process_id pair.
    • The MAX(CASE WHEN activity_type = 'end' THEN timestamp END) retrieves the end timestamp for each process, and MAX(CASE WHEN activity_type = 'start' THEN timestamp END) retrieves the start timestamp.
    • The GROUP BY machine_id, process_id ensures that we group the rows by both machine_id and process_id, which are unique pairs as described in the schema.

    Subquery Explanation in Chinese:

    • 我们使用子查询来计算每个 machine_idprocess_id 对应的开始和结束时间。
    • MAX(CASE WHEN activity_type = 'end' THEN timestamp END) 获取每个过程的 end 时间戳,MAX(CASE WHEN activity_type = 'start' THEN timestamp END) 获取 start 时间戳。
    • GROUP BY machine_id, process_id 确保我们按 machine_idprocess_id 分组,这些是唯一的组合。
  2. SELECT machine_id, ROUND(AVG(end_time – start_time), 3) AS processing_time:

    • After obtaining the start and end times for each process, we calculate the difference (end_time - start_time) for each process to get the process duration.
    • The AVG() function is used to calculate the average processing time for each machine.
    • We use ROUND() to round the result to 3 decimal places, as required by the problem.
    • Finally, we group the results by machine_id to get the average processing time for each machine.

    Explanation in Chinese:

    • 获取每个过程的开始和结束时间后,我们计算每个过程的持续时间 (end_time - start_time)。
    • 使用 AVG() 函数来计算每台机器的平均处理时间。
    • 使用 ROUND() 将结果保留到小数点后 3 位。
    • 最后,按 machine_id 分组,获取每台机器的平均处理时间。
  3. GROUP BY machine_id:

    • The final GROUP BY machine_id ensures that the average processing time is calculated for each machine individually.

    Explanation in Chinese:

    • 最后的 GROUP BY machine_id 确保我们为每台机器分别计算平均处理时间。

Key Concepts:

  • Subquery with Aggregation:
    In the subquery, we use MAX() with a CASE WHEN clause to extract the start and end timestamps from the Activity table. This ensures that we get both timestamps for each process_id and machine_id.

    带聚合的子查询:
    在子查询中,我们使用 MAX()CASE WHEN 子句从 Activity 表中提取 startend 时间戳。这确保我们获取每个 process_idmachine_id 的两个时间戳。

  • Time Difference:
    The time difference (end_time - start_time) gives us the duration for each process, and averaging this difference provides the average processing time for each machine.

    时间差:
    时间差 (end_time - start_time) 提供每个过程的持续时间,计算此差值的平均值可以得到每台机器的平均处理时间。

  • ROUND():
    We use the ROUND() function to round the result to 3 decimal places, as required.

    ROUND() 函数:
    使用 ROUND() 函数将结果四舍五入到小数点后 3 位。

Warnings:

  • Consistency in Time Units:
    Ensure that the timestamp values in the Activity table are consistent in terms of time units (e.g., seconds). Since the question specifies timestamp as a float representing seconds, the calculation should proceed without issues.

    时间单位的一致性:
    确保 Activity 表中的 timestamp 值的时间单位一致(例如,秒)。问题中指定 timestamp 是表示秒的浮点数,因此计算应该不会有问题。

Summary:

This query calculates the average time each machine takes to complete a process by calculating the time difference between end and start timestamps for each process and then averaging these values for each machine_id. The result is rounded to 3 decimal places, as required.

中文总结:
该查询通过计算每个过程的 endstart 时间戳之间的时间差来计算每台机器完成一个过程的平均时间,然后为每个 machine_id 取这些值的平均值。结果保留到小数点后三位。

Tips:

  • Indexing for Performance:
    Consider adding indexes on machine_id and process_id to optimize the performance of this query, especially if the Activity table is large.

    性能优化的索引:
    考虑在 machine_idprocess_id 列上创建索引,以优化查询性能,特别是在 Activity 表很大的情况下。

Similar SQL Problems:

  1. LeetCode 176: Second Highest Salary
    This problem involves aggregation functions, similar to how the AVG() function is used here.

  2. LeetCode 181: Employees Earning More Than Their Managers
    This problem involves comparing data between related entities, similar to how we compare start and end timestamps.

  3. LeetCode 183: Customers Who Never Order
    This problem filters data based on specific conditions, similar to how we group and calculate the average time in this query.

  4. LeetCode 196: Delete Duplicate Emails
    This problem involves identifying and filtering rows, similar to how we filter start and end times to compute the process duration.

  5. LeetCode 184: Department Highest Salary
    This problem involves calculating aggregated values, similar to how the AVG() function is used to calculate the average process time.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *