Upload a submission JSON file to compare its predictions against the ground-truth solutions. Scoring uses the Kaggle metric: a test case scores 1 if either of its attempts exactly matches the solution grid, and 0 otherwise. Both the ARC-AGI 1 and ARC-AGI 2 datasets are supported.
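As a rough illustration of the scoring rule, the sketch below scores a submission held in memory. It is not this tool's implementation. It assumes the submission maps each task ID to a list of test cases, each a dict of attempt grids keyed `attempt_1` and `attempt_2`, and that the solutions map each task ID to a list of solution grids. The function names `score_test_case` and `score_submission` are hypothetical, and averaging over all test cases is an assumption about how the per-case scores are combined.

```python
from typing import Dict, List

Grid = List[List[int]]


def score_test_case(attempts: Dict[str, Grid], solution: Grid) -> int:
    """Return 1 if any attempt exactly matches the solution grid, else 0."""
    # Exact match: same dimensions and the same value in every cell.
    return int(any(grid == solution for grid in attempts.values()))


def score_submission(
    submission: Dict[str, List[Dict[str, Grid]]],
    solutions: Dict[str, List[Grid]],
) -> float:
    """Average the per-test-case scores over every test case in `solutions`.

    Assumed layout (hypothetical, adjust to the real files):
      submission[task_id][i] == {"attempt_1": grid, "attempt_2": grid}
      solutions[task_id][i]  == grid
    """
    total = 0
    correct = 0
    for task_id, task_solutions in solutions.items():
        task_attempts = submission.get(task_id, [])
        for i, solution in enumerate(task_solutions):
            total += 1
            # A missing task or test case counts as a score of 0.
            if i < len(task_attempts):
                correct += score_test_case(task_attempts[i], solution)
    return correct / total if total else 0.0


# Made-up example data: two tasks, three test cases in total.
solutions = {
    "task_a": [[[1, 0], [0, 1]]],
    "task_b": [[[2, 2]], [[3]]],
}
submission = {
    "task_a": [{"attempt_1": [[0, 0], [0, 0]], "attempt_2": [[1, 0], [0, 1]]}],
    "task_b": [
        {"attempt_1": [[2, 2]], "attempt_2": [[2, 2]]},
        {"attempt_1": [[3, 3]], "attempt_2": [[0]]},
    ],
}

print(f"score: {score_submission(submission, solutions):.3f}")
```

In this made-up example two of the three test cases are matched by at least one attempt, so the averaged score is 2/3. A grid with the right values but the wrong shape does not count as a match.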