Table: Queries
| Column Name | Type |
| query_name | varchar |
| result | varchar |
| position | int |
| rating | int |
This table may have duplicate rows.
This table contains information collected from some queries on a database.
The position column has a value from 1 to 500.
The rating column has a value from 1 to 5. A query with a rating less than 3 is a poor query.
We define query quality as:
The average of the ratio between query rating and its position.
We also define poor query percentage as:
The percentage of all queries with rating less than 3.
Write a solution to find each query_name, the quality and poor_query_percentage.
Both quality and poor_query_percentage should be rounded to 2 decimal places.
Return the result table in any order.
Queries table:
| query_name | result | position | rating |
| Dog | Golden Retriever | 1 | 5 |
| Dog | German Shepherd | 2 | 5 |
| Dog | Mule | 200 | 1 |
| Cat | Shirazi | 5 | 2 |
| Cat | Siamese | 3 | 3 |
| Cat | Sphynx | 7 | 4 |
Result table:
| query_name | quality | poor_query_percentage |
| Dog | 2.50 | 33.33 |
| Cat | 0.66 | 33.33 |
Dog queries quality is ((5 / 1) + (5 / 2) + (1 / 200)) / 3 = 2.50
Dog queries poor_query_percentage is (1 / 3) * 100 = 33.33
Cat queries quality is ((2 / 5) + (3 / 3) + (4 / 7)) / 3 = 0.66
Cat queries poor_query_percentage is (1 / 3) * 100 = 33.33
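The arithmetic above can be checked with a few lines of plain Python. This is only a sketch: the helper names `quality` and `poor_query_percentage` and the `rows` dictionary are made up for illustration, and the numbers are copied from the example table.

```python
# Hypothetical helpers for checking the worked example by hand.
# Each entry is a (position, rating) pair taken from the example table.
rows = {
    "Dog": [(1, 5), (2, 5), (200, 1)],
    "Cat": [(5, 2), (3, 3), (7, 4)],
}


def quality(pairs):
    """Average of rating / position over all rows of one query_name."""
    return sum(rating / position for position, rating in pairs) / len(pairs)


def poor_query_percentage(pairs):
    """Share of rows with rating < 3, expressed as a percentage."""
    poor = sum(1 for _, rating in pairs if rating < 3)
    return poor / len(pairs) * 100


for name, pairs in rows.items():
    print(name, round(quality(pairs), 2), round(poor_query_percentage(pairs), 2))
```

Python drops trailing zeros when printing floats, so the Dog quality shows up as 2.5 rather than 2.50; the value is the same.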
Solution
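import pandas as pd

def queries_stats(queries: pd.DataFrame) -> pd.DataFrame:
    # Per-row ratio of rating to position. position is always 1 to 500, so division by zero
    # cannot occur; the 1e-6 offset only pushes values sitting on a rounding boundary upward.
    queries['quality'] = queries['rating'] / queries['position'] + 1e-6
    # A row counts as 100 when its rating is below 3 (poor), otherwise 0
    queries['poor_query_percentage'] = queries['rating'].apply(lambda x: 100 if x < 3 else 0)
    # Mean of quality and poor_query_percentage per query_name, rounded to 2 decimal places
    return queries.groupby('query_name').agg({'quality':'mean', 'poor_query_percentage':'mean'}).round(2).reset_index()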
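To try the same idea on the example data, here is a minimal sketch of a variant that does not add columns to the caller's DataFrame. The function name `queries_stats_no_mutation`, the temporary column names `ratio` and `is_poor`, and the `sample` frame are assumptions for illustration, not part of the problem. It also leaves out the `1e-6` offset, so a value sitting exactly on a rounding boundary may round differently from the solution above.

```python
import pandas as pd


def queries_stats_no_mutation(queries: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical variant: same aggregation, input frame left untouched."""
    # assign() returns a new frame, so the caller's DataFrame keeps its columns
    per_row = queries.assign(
        ratio=queries["rating"] / queries["position"],  # rating / position per row
        is_poor=(queries["rating"] < 3) * 100,          # 100 for poor rows, else 0
    )
    return (
        per_row.groupby("query_name")
        .agg(quality=("ratio", "mean"), poor_query_percentage=("is_poor", "mean"))
        .round(2)
        .reset_index()
    )


# Example table from the problem statement
sample = pd.DataFrame(
    {
        "query_name": ["Dog", "Dog", "Dog", "Cat", "Cat", "Cat"],
        "result": ["Golden Retriever", "German Shepherd", "Mule",
                   "Shirazi", "Siamese", "Sphynx"],
        "position": [1, 2, 200, 5, 3, 7],
        "rating": [5, 5, 1, 2, 3, 4],
    }
)

result = queries_stats_no_mutation(sample)
print(result)
```

Named aggregation (`quality=("ratio", "mean")`) lets the output columns be named directly, which is why the temporary columns can carry different names from the final ones.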
I hope this helps anyone teaching themselves Python.
If you know a better approach or spot an error, please feel free to let me know.