xiangyi-li/OS-World
mirrored 12 minutes ago
  1. mm_agents
  • README.md
    1.99 kB
  • __init__.py
    -
  • accessibility_tree_wrap
    -
  • agent.py
    47.3 kB
  • agi_agent.py
    7.33 kB
  • aguvis_agent.py
    23 kB
  • anthropic
    -
  • autoglm
    -
  • autoglm_v
    -
  • aworldguiagent
    -
  • coact
    -
  • dart_gui
    -
  • dart_gui_agent.py
    26.1 kB
  • evocua
    -
  • gpt54_agent.py
    24.4 kB
  • gta1
    -
  • gui_som
    -
  • hosted_gbox_agent.py
    6.52 kB
  • jedi_3b_agent.py
    17.9 kB
  • jedi_7b_agent.py
    17 kB
  • kimi
    -
  • llm_server
    -
  • maestro
    -
  • mano_agent.py
    42.6 kB
  • mobileagent_v3
    -
  • o3_agent.py
    9.14 kB
  • openai_cua_agent.py
    32.2 kB
  • opencua
    -
  • os_symphony
    -
  • owl_agent.py
    38 kB
  • prompts.py
    75.4 kB
  • qwen25vl_agent.py
    24 kB
  • qwen3vl_agent.py
    29.7 kB
  • seed_agent.py
    32.7 kB
  • uipath
    -
  • uipath_agent.py
    8.34 kB
  • uitars15_v1.py
    38.5 kB
  • uitars15_v2.py
    37.1 kB
  • uitars_agent.py
    30.3 kB
  • utils
    -
Add Llama3-70B Support (from Groq)
2 years ago
Modify the namespace of a11y tree (#62)
2 years ago
Add DuckTrack as initial annotation tool; Initial multimodal test
2 years ago
osworld agent wrapper for trained v7 (#360)
5 months ago
optimize prompt
17 minutes ago
Dunjie Lu: Merge pull request #452 from xlang-ai/dev_djlu/gpt54_agent (optimize gpt5.4 prompt), cda933f
Fix demo agent (PromptAgent) reset(): add vm_ip and kwargs for compatibility with lib_run_single.py (#388)
3 months ago
feat: add client password argument to multiple agents and scripts - Introduced `--client_password` argument in `run_multienv_aguvis.py`, `run_multienv_claude.py`, and `run_multienv_gta1.py` for enhanced security and flexibility. - Updated agent classes (`PromptAgent`, `AguvisAgent`, `GTA1Agent`) to accept and utilize `client_password` for improved configuration. - Modified evaluation guidelines to reflect the new client password requirement. - Ensured existing logic remains intact while enhancing functionality for better user experience.
7 months ago
feat: add client password argument to multiple agents and scripts - Introduced `--client_password` argument in `run_multienv_aguvis.py`, `run_multienv_claude.py`, and `run_multienv_gta1.py` for enhanced security and flexibility. - Updated agent classes (`PromptAgent`, `AguvisAgent`, `GTA1Agent`) to accept and utilize `client_password` for improved configuration. - Modified evaluation guidelines to reflect the new client password requirement. - Ensured existing logic remains intact while enhancing functionality for better user experience.
7 months ago
feat: add client password argument to multiple agents and scripts - Introduced `--client_password` argument in `run_multienv_aguvis.py`, `run_multienv_claude.py`, and `run_multienv_gta1.py` for enhanced security and flexibility. - Updated agent classes (`PromptAgent`, `AguvisAgent`, `GTA1Agent`) to accept and utilize `client_password` for improved configuration. - Modified evaluation guidelines to reflect the new client password requirement. - Ensured existing logic remains intact while enhancing functionality for better user experience.
7 months ago
EvoCUA Update (2025.01.05) (#412) * evocua init * setup max_token * evocua update --------- Co-authored-by: xuetaofeng <xuetaofeng@meituan.com> Co-authored-by: Tianbao Xie <47296835+Timothyxxx@users.noreply.github.com>
2 months ago
Add hosted GBOX agent for OSWorld evaluation (#376)
4 months ago
support opus4.6 (#437)
14 days ago
feat: refactor run_multienv_qwen25vl.py and qwen25vl_agent.py for improved logging and task management - Introduced signal handling for graceful shutdown of environments and processes. - Enhanced logging configuration to support dynamic log levels and structured output. - Updated argument parsing to include new parameters for model selection and task execution. - Refactored task distribution logic to streamline environment task management. - Improved error handling during task execution and environment cleanup. - Adjusted Qwen25VLAgent initialization to support new model and thought prefix options. - Reduced max tries for LLM calls to optimize performance.
8 months ago
support mano agent (#338) Co-authored-by: Fei Hu <molanhand@users.noreply.github.com>
6 months ago
add cogagent server
2 years ago
uipath v2 (#413) * submission v2 * small updates
2 months ago
uipath v2 (#413) * submission v2 * small updates
2 months ago
update aworldguiAgent code (#342)
6 months ago
Add support for GUI-Owl agent (#318) * add run_multienv_owl.py * add owl_agent.py
6 months ago
feat: add run_multienv_o3.py script for multi-environment evaluation - Introduced a new script `run_multienv_o3.py` to facilitate end-to-end evaluation across multiple environments. - Implemented command-line argument parsing for various configurations including environment settings, logging levels, and AWS parameters. - Integrated signal handling for graceful shutdown of environments and processes. - Enhanced logging capabilities for better traceability during execution. - Maintained existing logic from previous scripts while introducing new functionalities for improved evaluation processes.
7 months ago
support aliyun eval of qwen3vl
5 months ago
update coact: add autogen/cache
7 months ago
feat: update jedi agent with support for o3 as planner
7 months ago
feat: update jedi agent with support for o3 as planner
7 months ago
Fix corner cases (val connection in chrome when using playwright, and action parsing for agent, and accessibility tree xml handling)
2 years ago
feat/dart_gui (#371)
4 months ago
feat/dart_gui (#371)
4 months ago
fix(os_symphony):prompt (#402) * add_os_symphony * fix(os_symphony) * fix(os_symphony):prompt --------- Co-authored-by: Tianbao Xie <47296835+Timothyxxx@users.noreply.github.com>
2 months ago
init public release (#350)
5 months ago
fix #210: add a11y_tree support to UITARSAgent (#346)
6 months ago
Add autoglm-os-9b-v (#344) * update for autoglm-v * Update run_autoglm.py --------- Co-authored-by: hanyullai <hanyullai@outlook.com>
6 months ago
Update: seed agent
3 months ago
Add AutoGLM-OS agent (#309) * autoglm-os initialize * clean code * chore: use proxy for download setup * feat(autoglm-os): add parameter to toggle images * fix: use temporary directory for files pulled from the vm to prevent potential collision when running multiple instances of the same task in parallel * update * add client_password * update multienv * fix * fix prompt * fix prompt * fix prompt * fix sys prompt * feat: use proxy in file evaluator * fix client_password * fix note_prompt * fix autoglm agent cmd type * fix * revert: fix: use temporary directory for files pulled from the vm to prevent potential collision when running multiple instances of the same task in parallel reverts commit bab5473eea1de0e61b0e1d68b23ce324a5b0ee57 * feat(autoglm): setup tools * fix(autoglm): remove second time of get a11y tree * add osworld server restart * Revert "add osworld server restart" This reverts commit 7bd9d84122e246ce2a26de0e49c25494244c2b3d. * fix _launch_setup * fix autoglm agent tools & xml tree * fix desktop_env * fix bug for tool name capitalization * fix: always use proxy for setup download * add fail after exceeding max turns * fix(autoglm): avoid adding image to message when screenshot is empty * fix maximize_window * fix maximize_window * fix maximize_window * fix import browsertools module bug * fix task proxy config bug * restore setup * refactor desktop env * restore image in provider * restore file.py * refactor desktop_env * quick fix * refactor desktop_env.step * fix our env reset * add max truns constraint * clean run script * clean lib_run_single.py --------- Co-authored-by: hanyullai <hanyullai@outlook.com> Co-authored-by: JingBh <jingbohao@yeah.net>
7 months ago
Kimi k25 (#428) * kimi k2.5 agent
a month ago
support_qwen25vl (#276) Co-authored-by: root <ludunjie1219@github.com>
8 months ago
OpenCUA-72B (#354) * use aws pub ip * os task fix: set the default dim screen time to be 300s * OpenCUA-72B * update password * update * update * update opencua72b agent * change provider ip --------- Co-authored-by: Jiaqi <dengjiaqi@moonshot.cn>
5 months ago
add support for mobile agent v3 (#328) * add support for mobile agent v3 * add mobile_agent * add support for mobile agent v3
6 months ago
Uitars/dev (#291) * use aws pub ip * os task fix: set the default dim screen time to be 300s * add all the uitars agents: 1. run_multienv_uitars.py: Qwen2VL-based UITARS models 2. run_multienv_uitars15_v1.py: UITARS1.5-7B 3. run_multienv_uitars15_v2.py: SeedVL1.5 thinking/non-thinking --------- Co-authored-by: Jiaqi <dengjiaqi@moonshot.cn>
7 months ago
Uitars/dev (#291) * use aws pub ip * os task fix: set the default dim screen time to be 300s * add all the uitars agents: 1. run_multienv_uitars.py: Qwen2VL-based UITARS models 2. run_multienv_uitars15_v1.py: UITARS1.5-7B 3. run_multienv_uitars15_v2.py: SeedVL1.5 thinking/non-thinking --------- Co-authored-by: Jiaqi <dengjiaqi@moonshot.cn>
7 months ago
Add multiple new modules and tools to enhance the functionality and extensibility of the Maestro project (#333) * Added a **pyproject.toml** file to define project metadata and dependencies. * Added **run_maestro.py** and **osworld_run_maestro.py** to provide the main execution logic. * Introduced multiple new modules, including **Evaluator**, **Controller**, **Manager**, and **Sub-Worker**, supporting task planning, state management, and data analysis. * Added a **tools module** containing utility functions and tool configurations to improve code reusability. * Updated the **README** and documentation with usage examples and module descriptions. These changes lay the foundation for expanding the Maestro project’s functionality and improving the user experience. Co-authored-by: Hiroid <guoliangxuan@deepmatrix.com>
6 months ago