xiangyi-li/webarena
mirrored 13 minutes ago
/test_evaluation_harness/tests/configs
  • func_eval_fail.json (841 B): update evaluators to match the new config format, 3 years ago
  • func_eval_success.json (842 B): update evaluators to match the new config format, 3 years ago
  • func_url_func_1.json (686 B): update evaluators to match the new config format, 3 years ago
  • func_url_func_2.json (977 B): update evaluators to match the new config format, 3 years ago
  • html_content_element_exact_match.json (864 B): update evaluators to match the new config format, 3 years ago
  • html_content_exact_match.json (783 B): update evaluators to match the new config format, 3 years ago
  • html_content_url_comb.json (922 B): update evaluators to match the new config format, 3 years ago
  • string_match.json (478 B): update test example due to html escape, 3 years ago
  • url_exact_match.json (525 B): Update tests configs to fit the current settings, 3 years ago
Shuyan Zhou: Update README.md (3e5c8f9)