This directory contains the data examples used to benchmark how well agents interact with GUIs. The examples are stored in ./examples, where each data item is formatted as ...
"instruction": "Could you open VLC and start playing this exact HLS URL as a network stream, making sure it\u2019s actually playing (not paused)? https://devstreaming ...
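
As a rough illustration of how such items might be read, here is a minimal sketch. It assumes each data item is a standalone .json file somewhere under ./examples and that the task text sits in a top-level "instruction" field, as in the sample above. The function name load_instructions and the one-file-per-item layout are assumptions made for illustration, not something this README specifies.

```python
import json
from pathlib import Path


def load_instructions(examples_dir):
    """Map each JSON file's relative path to its "instruction" string.

    Assumes one data item per .json file (hypothetical layout).
    """
    root = Path(examples_dir)
    instructions = {}
    # Walk the directory tree so nested per-application folders are included.
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as f:
            item = json.load(f)
        # Skip files that do not follow the assumed layout.
        if isinstance(item, dict) and "instruction" in item:
            instructions[path.relative_to(root).as_posix()] = item["instruction"]
    return instructions
```

Calling load_instructions("./examples") would return a dictionary keyed by relative file path. JSON escapes such as \u2019 are decoded by json.load, so the instruction text comes back with the real apostrophe character.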