Grading Rubric (Group Assignment)

Assessment of Learning Goal 1: Understand how to design experiments using machine learning pipelines to help stakeholders make sense of structured data (45%)

  • Excellent (9-10)
    • The distribution of the data is calculated in a thorough manner, using a variety of relevant measures and/or visualizations, and explained in a clear and detailed way.
    • All relevant features are identified and preprocessed in a thoughtful and comprehensive way, with clear explanations for each change made.
    • The decision to change or not change the label of the dataset is justified and explained in detail.
    • The observations regarding the relationship between feature importance and feature statistical analysis are insightful and well-supported with evidence, and the specific example provided illustrates this relationship in a clear and compelling way.
  • Good (7-8)
    • The distribution of the data is calculated using relevant measures and/or visualizations, and explained in a clear and detailed way.
    • All relevant features are identified and preprocessed in a thoughtful way, with clear explanations for each change made.
    • The decision to change or not change the label of the dataset is justified and explained.
    • The observations regarding the relationship between feature importance and feature statistical analysis are insightful and supported with evidence, and the specific example provided illustrates this relationship in a clear way.
  • Sufficient (6)
    • The distribution of the data is calculated using relevant measures and/or visualizations, and explained in a somewhat clear way.
    • Most relevant features are identified and preprocessed, with some explanations for each change made.
    • The decision to change or not change the label of the dataset is explained.
    • The observations regarding the relationship between feature importance and feature statistical analysis are somewhat insightful and supported with some evidence, and the specific example provided illustrates this relationship in a somewhat clear way.
  • Insufficient (<6)
    • The distribution of the data is not calculated or not explained clearly.
    • Few relevant features are identified and preprocessed, with little to no explanation for each change made.
    • The decision to change or not change the label of the dataset is not explained.
    • The observations regarding the relationship between feature importance and feature statistical analysis are not insightful and/or not supported with evidence, and the specific example provided does not clearly illustrate this relationship.
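To give a concrete sense of the "variety of relevant measures" expected for the distribution analysis, the following minimal sketch uses only the Python standard library on a toy feature (the variable names and values are illustrative, not from any assignment dataset; in practice you would apply this to your own columns, likely with pandas and plots):

```python
import statistics

# Toy numeric feature; replace with a real column from your dataset.
ages = [23, 25, 25, 27, 30, 31, 35, 40, 41, 58]

# A small battery of distribution measures, as the rubric asks for.
summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "stdev": statistics.stdev(ages),
    "min": min(ages),
    "max": max(ages),
}

# A crude skewness indicator: mean > median suggests a right-skewed feature.
# This is the kind of observation you would then connect to preprocessing
# choices (e.g. a log transform) and, later, to feature importance.
summary["right_skewed"] = summary["mean"] > summary["median"]

for name, value in summary.items():
    print(f"{name}: {value}")
```

Reporting such measures alongside visualizations, and explaining what they imply for preprocessing, is what distinguishes the higher score bands.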

Assessment of Learning Goal 2: Quality of critically reflecting the experiment and supporting the findings with evidence (45%)

  • Excellent (9-10)
    • Provide a detailed description of the multiple models used to fit the dataset and convincingly explain the adjustments made to their hyper-parameters.
    • Explain the selection of appropriate evaluation metrics with detailed reasoning.
    • Show evidence of using multiple validation methods, such as cross-validation and random splitting, and explain in detail why they were selected.
    • Describe the performance of the model on additional datasets with proper analysis and an explanation for any variations in performance.
  • Good (7-8)
    • Provide a description of the models used to fit the dataset and reasonably explain the adjustments made to their hyper-parameters.
    • Explain the selection of appropriate evaluation metrics with proper reasoning.
    • Show evidence of using some validation methods, such as cross-validation or random splitting, and give a reasonable explanation for selecting them.
    • Describe the performance of the model on additional datasets with reasonable analysis and an explanation for any variations in performance.
  • Sufficient (6)
    • Provide a brief description of the models used to fit the dataset and explain the adjustments made to hyper-parameters, though the explanation may not be convincing.
    • Explain the selection of some evaluation metrics with some reasoning.
    • Show some evidence of using validation methods, though the explanation may not be convincing.
    • Describe the performance of the model on additional datasets with some analysis, though the explanation for any variations in performance may be insufficient.
  • Insufficient (<6)
    • Provide no or only a very limited description of the models used to fit the dataset; the explanation of the adjustments made to hyper-parameters may be absent or inadequate.
    • Provide no or only a limited explanation for the selection of evaluation metrics.
    • Provide no or only inadequate evidence of using validation methods; the accompanying explanation may be absent or inadequate.
    • Provide no or only an inadequate description of the performance of the model on additional datasets; the explanation for any variations in performance may be absent or inadequate.
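To make concrete what "multiple validation methods like cross-validation and random splitting" can look like, here is a minimal, dependency-free sketch. A toy mean-predictor model and a mean-squared-error metric stand in for the real pipeline; in an actual submission you would typically use a library such as scikit-learn (e.g. `KFold` and `train_test_split`) instead of hand-rolled helpers like these:

```python
import random

def mse(y_true, y_pred):
    """Mean squared error, the evaluation metric in this toy example."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def fit_mean_model(y_train):
    """A trivial 'model' that always predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda n: [mean] * n

def k_fold_scores(y, k=5):
    """k-fold cross-validation: each fold is held out once for evaluation."""
    folds = [y[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]
        train = [v for j, fold in enumerate(folds) if j != i for v in fold]
        model = fit_mean_model(train)
        scores.append(mse(test, model(len(test))))
    return scores

def random_split_score(y, test_fraction=0.3, seed=0):
    """A single random train/test split as a second validation method."""
    rng = random.Random(seed)
    shuffled = y[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    train, test = shuffled[:cut], shuffled[cut:]
    model = fit_mean_model(train)
    return mse(test, model(len(test)))

targets = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0]
cv_scores = k_fold_scores(targets, k=5)
print("cross-validation MSE per fold:", cv_scores)
print("random-split MSE:", random_split_score(targets))
```

Reporting both kinds of scores, and explaining why each method was chosen and how the results differ, is the sort of evidence the higher bands reward.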

Assessment of Learning Goal 3: Ability to automate the experiment (10%)

  • Excellent (9-10)
    • The experiment is fully automated, has good documentation about how the code works, and has very good code quality.
  • Good (7-8)
    • The experiment is fully automated, has reasonable documentation about how the code works, and the code is mostly human-readable.
  • Sufficient (6)
    • Some parts of the experiment are automated, but others still require manual effort. There is some documentation about how the code works, though parts of it may be unclear, and some parts of the code are hard to understand.
  • Insufficient (<6)
    • The experiment is not automated, or some parts are automated with no documentation and poor code quality.
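As a rough illustration of what "fully automated" can mean here: the whole experiment is reproducible from a single, documented, seeded entry point, with no manual steps in between. The sketch below is hypothetical (the function name, toy data, and trivial model are placeholders for your actual pipeline):

```python
import json
import random

def run_experiment(seed: int = 42) -> dict:
    """Run the entire (toy) experiment end to end: generate data,
    'train' a trivial mean model, and evaluate it. A single seeded
    entry point like this is what makes the run reproducible."""
    rng = random.Random(seed)
    data = [rng.gauss(0.0, 1.0) for _ in range(100)]
    train, test = data[:80], data[80:]
    prediction = sum(train) / len(train)  # trivial mean model
    test_mse = sum((x - prediction) ** 2 for x in test) / len(test)
    return {"seed": seed, "test_mse": test_mse}

if __name__ == "__main__":
    # One command reproduces the whole experiment and records the result.
    print(json.dumps(run_experiment()))
```

Code that can be rerun like this from one command, together with clear documentation and readable structure, is what the Excellent band describes.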