Question on how to get run score for test dataset

## ❓ Questions and Help

Hello team, thanks for creating this wonderful toolset for data science automation. I am creating a custom dataset and using it to train a binary classification model.

- The command I am using: `rdagent data_science --competition <task_name>`
- I am using a data structure similar to `arf-12-hours-prediction-task` in the tutorial. I have also provided the test dataset, in the following file hierarchy:
```
git_ignore_folder/ds_data
├── eval
│   ├── <task_name>
│       ├── grade.py
│       ├── submission_test.csv
│       └── valid.py
├── <task_name>
│   ├── description.md
│   ├── sample.py
│   ├── sample_submission.csv
│   ├── test
│   │   ├── info.csv
│   │   └── X.npy
│   └── train
│       ├── info.csv
│       └── X.npy
```
- I can successfully trigger the task, but in the tracker UI the run score (test) never seems to be computed. Please see the screenshot.

<img width="1577" height="485" alt="Image" src="https://github.com/user-attachments/assets/fec3429c-7edb-48fe-bd47-37031a50f6a8" />

- What have I missed? The env file is as follows:
```
# ==========================================
# Task Configuration
# ==========================================
DS_LOCAL_DATA_PATH="git_ignore_folder/ds_data"
DS_CODER_ON_WHOLE_PIPELINE=True
DS_CODER_COSTEER_ENV_TYPE=docker
DS_IF_USING_MLE_DATA=False
DS_SAMPLE_DATA_BY_LLM=False
DS_SCEN=rdagent.scenarios.data_science.scen.DataScienceScen
```
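
To rule out a simple path problem, the layout above can be checked with a small script. This is only a minimal sketch: the helper name `missing_files` is hypothetical and not part of RD-Agent, and the list of expected files is copied from the tree in this issue, so whether RD-Agent actually requires each of them is an assumption.

```python
from pathlib import Path

# Relative paths copied from the tree in this issue. Whether RD-Agent needs
# every one of them is an assumption, not something taken from its docs.
EXPECTED = [
    "eval/{task}/grade.py",
    "eval/{task}/valid.py",
    "eval/{task}/submission_test.csv",
    "{task}/description.md",
    "{task}/sample.py",
    "{task}/sample_submission.csv",
    "{task}/train/info.csv",
    "{task}/train/X.npy",
    "{task}/test/info.csv",
    "{task}/test/X.npy",
]


def missing_files(data_root, task):
    """Return the expected relative paths that are not regular files under data_root."""
    root = Path(data_root)
    missing = []
    for template in EXPECTED:
        rel = template.format(task=task)
        if not (root / rel).is_file():
            missing.append(rel)
    return missing
```

Calling `missing_files("git_ignore_folder/ds_data", "<task_name>")` with the real task name returns an empty list when every file from the tree is present, and otherwise the relative paths that could not be found.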
