Setup
You should use Vivaria to develop tasks. You can find setup instructions for Vivaria here.
Once you’ve installed Vivaria, head over to our example tasks on GitHub. These examples demonstrate how to develop a basic task.
Running a task
You should follow the instructions for uploading your task to Vivaria.
When the agent completes a task, it submits a final answer for scoring. You can simulate this by writing to the file /home/agent/submission.txt.
echo "3580245" > /home/agent/submission.txt
Go back to the first terminal to see the final score for the task. If you gave the correct answer, it should be 1.0!
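The score is produced by the task family's scoring method. As a rough sketch, assuming the Task Standard's TaskFamily layout (static get_tasks, get_instructions, and score methods), a task that checks a single numeric answer might look like the following. The task name "main", the prompt text, and the version string are hypothetical placeholders and are not taken from the example tasks.

```python
class TaskFamily:
    # Assumed version string; use the Task Standard version your setup targets.
    standard_version = "0.3.0"

    @staticmethod
    def get_tasks() -> dict:
        # Hypothetical task: the name, prompt, and answer are placeholders.
        return {
            "main": {
                "prompt": "What is 1790122 + 1790123?",
                "answer": "3580245",
            }
        }

    @staticmethod
    def get_instructions(t: dict) -> str:
        # Tell the agent what to solve and that it should submit its answer.
        return f"{t['prompt']} Submit your final answer as a single integer."

    @staticmethod
    def score(t: dict, submission: str) -> float:
        # Strip whitespace: `echo` appends a trailing newline to the file.
        return 1.0 if submission.strip() == t["answer"] else 0.0


task = TaskFamily.get_tasks()["main"]
print(TaskFamily.score(task, "3580245\n"))  # 1.0
```

Stripping whitespace before comparing means a submission written with echo, which ends in a newline, still matches the expected answer.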
Note: do not refer to instructions.txt or submission.txt in your task. Instead, write instructions in TaskFamily.get_instructions and advise the agent to “submit” its solution. The agent will know how to do this.
Testing a task
A task family may contain one or more automated tests (e.g. test_my_task.py) that use the pytest framework. You can run these using the same method as running the task, except you should run viv task test instead of viv task start.
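As an illustration, a test file such as test_my_task.py might exercise the scoring logic directly. This is a minimal sketch using plain pytest-style test functions. The score function below is a hypothetical stand-in, defined inline to keep the example self-contained; in a real task family you would call your own TaskFamily.score instead.

```python
def score(expected: str, submission: str) -> float:
    """Hypothetical stand-in for a task family's scoring logic."""
    return 1.0 if submission.strip() == expected else 0.0


def test_correct_answer_scores_one() -> None:
    # A trailing newline (as written by `echo`) should not affect the score.
    assert score("3580245", "3580245\n") == 1.0


def test_wrong_answer_scores_zero() -> None:
    assert score("3580245", "42") == 0.0
```

pytest discovers functions whose names start with test_, so no explicit registration is needed.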