OS-World/xiangyi-li · BenchFlow

__init__.py
1.65 kB
Fix chrome dark-mode task evaluation for appearance settings
24 days ago
calc.py
522 B
Clean code; Refactor environment to pass screenshot content instead of path
2 years ago
chrome.py
135 kB
fix(chrome): recreation.gov getter timeouts for search result and new page (#438)
- Search result: wait for search URL and domcontentloaded before looking for .search-result-highlight--success; add fallback to attached + scroll into view so the element is found when visible but not yet "visible" to Playwright.
- New page: use wait_for_load_state(load) instead of networkidle so the popup is considered ready once the load event fires; recreation.gov keeps background requests so networkidle often never fires and caused 60s timeouts.
Tested with a full run.
13 days ago
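The waiting strategy described in the chrome.py commit message can be sketched roughly as follows. This is a minimal illustration, not the code from chrome.py: the helper names `wait_for_search_result` and `wait_for_new_page`, their parameters, and the default timeout are assumptions made for the example. Only the selector and the choice of load states come from the commit message. The Playwright calls used (`wait_for_url`, `locator`, `wait_for`, `scroll_into_view_if_needed`, `expect_page`, `wait_for_load_state`) are part of the Playwright sync API for Python.

```python
try:
    from playwright.sync_api import TimeoutError as PlaywrightTimeoutError
except ImportError:
    # Fallback so the sketch can be read and exercised without Playwright installed.
    PlaywrightTimeoutError = TimeoutError

# Selector quoted from the commit message.
RESULT_SELECTOR = ".search-result-highlight--success"


def wait_for_search_result(page, url_pattern, timeout_ms=30_000):
    """Hypothetical helper: return a locator for the highlighted search result."""
    # Wait until the page is on the search URL and DOMContentLoaded has fired.
    page.wait_for_url(url_pattern, wait_until="domcontentloaded", timeout=timeout_ms)
    result = page.locator(RESULT_SELECTOR).first
    try:
        result.wait_for(state="visible", timeout=timeout_ms)
    except PlaywrightTimeoutError:
        # Fallback: accept an element that is only attached to the DOM,
        # then scroll it into view.
        result.wait_for(state="attached", timeout=timeout_ms)
        result.scroll_into_view_if_needed(timeout=timeout_ms)
    return result


def wait_for_new_page(context, trigger, timeout_ms=30_000):
    """Hypothetical helper: run trigger() and return the page it opens."""
    with context.expect_page(timeout=timeout_ms) as page_info:
        trigger()
    new_page = page_info.value
    # Wait for "load" rather than "networkidle": a site that keeps issuing
    # background requests may never become network-idle.
    new_page.wait_for_load_state("load", timeout=timeout_ms)
    return new_page
```

Both helpers take the Playwright `page` or `context` object as an argument, so they can be called from an existing getter without owning the browser lifecycle.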
file.py
5.46 kB
fix: Enhance error handling and logging across multiple evaluators
- Added logging for file retrieval and error handling in file.py, improving robustness during file operations.
- Implemented checks for file existence and parsing errors in general.py, enhancing reliability in JSON/YAML processing.
- Improved table comparison logic in table.py with detailed error logging for sheet loading and cell value reading.
- Enhanced metrics evaluation in slides.py with additional checks for paragraph and run counts, ensuring thorough comparison.
- Updated utils.py to include file existence checks and detailed error logging during cell value reading.
8 months ago
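The pattern this commit message describes (check that the file exists, then log and recover from parse errors) might look something like the sketch below for the JSON case. The function name `load_json_or_none` and the choice to return `None` on failure are assumptions for illustration; the actual evaluators may signal failure differently.

```python
import json
import logging
import os
from typing import Any, Optional

logger = logging.getLogger(__name__)


def load_json_or_none(path: Optional[str]) -> Optional[Any]:
    """Hypothetical helper: parse a JSON file, returning None on any failure."""
    if not path:
        logger.error("No file path provided")
        return None
    if not os.path.isfile(path):
        logger.error("File does not exist: %s", path)
        return None
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except (json.JSONDecodeError, UnicodeDecodeError, OSError) as exc:
        # Log the parse or read error instead of letting it abort the evaluation.
        logger.error("Failed to read or parse %s: %s", path, exc)
        return None
```

One caveat of this design: a file containing the JSON literal `null` also yields `None`, so a caller that needs to tell the two cases apart would need a different failure signal.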
general.py
1.25 kB
Support Docker VM manager and provider (#75)
* Add docker provider framework
* Update VM download link
* Add stop container
* Update docker manager & provider
* Update
* Update
* Update provider
a year ago
gimp.py
1.11 kB
Fix minor errors in vscode and gimp about path and postconfig
2 years ago
impress.py
7.03 kB
update multi-apps
2 years ago
info.py
1.5 kB
feat: enhance VM wallpaper retrieval and image similarity checks
- Added logging to the VM wallpaper retrieval function to capture errors and warnings related to content retrieval and file creation.
- Implemented checks for None, empty, and invalid content types to ensure robustness in wallpaper handling.
- Enhanced the SSIM structure check function with size validation and improved error handling for image processing.
- Added logging for image size discrepancies and exceptions during SSIM computation to aid in debugging.
These changes improve error handling and logging, ensuring better maintainability and reliability of the evaluators.
8 months ago
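As a rough illustration of the content checks listed in this commit message (None, empty, and invalid content types), the sketch below validates wallpaper bytes before writing them to disk. It covers only the retrieval half, not the SSIM comparison. The function name `save_wallpaper`, its signature, and the decision to return `None` on failure are assumptions, not the code in info.py.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def save_wallpaper(content: object, dest_path: str) -> Optional[str]:
    """Hypothetical helper: write wallpaper bytes to dest_path, or return None."""
    if content is None:
        logger.error("Wallpaper content is None")
        return None
    if not isinstance(content, (bytes, bytearray)):
        logger.error("Unexpected wallpaper content type: %s", type(content).__name__)
        return None
    if len(content) == 0:
        logger.warning("Wallpaper content is empty")
        return None
    try:
        with open(dest_path, "wb") as f:
            f.write(content)
    except OSError as exc:
        # File creation or write failed; log it and report failure to the caller.
        logger.error("Could not write wallpaper to %s: %s", dest_path, exc)
        return None
    return dest_path
```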
misc.py
24 kB
Updated misc:get_rule_relativeTime to support list in relativeRules[expected][time] (#447)
5 days ago
replay.py
709 B
Finish loading the vscode examples v1; Improve on the infra: Add accessibility tree into the observation; Add activate window function, etc
2 years ago
vlc.py
3.69 kB
[Feature] Initialize and Implement Aguvis Evaluation on OSWorld (#98)
* Initialize Aguvis eval on OSWorld
* Debug
* Debug
* v1, internal version
* Add experiments script
* Fix minor bugs
* Update new endpoint
* Update ip
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Fix model name
* Fix docker close issues; update prompting
* Fix missed
* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'
* Fix server and chromium ports in setup
* Revert and add missed dependency
* Add VLC port for docker
* Update
* Clean
---------
Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
a year ago
vscode.py
1.08 kB
add multi-app examples
2 years ago